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Foreword 



This volume contains the papers presented at the 4th International Workshop 
on Approximation Algorithms for Combinatorial Optimization Problems (AP- 
PROX’Ol) and the 5th International Workshop on Randomization and Approxi- 
mation Techniques in Computer Science (RANDOM’Ol), which took place con- 
currently at the University of California, Berkeley, from August 18-20, 2001. 
APPROX focuses on algorithmic and complexity issues surrounding the develop- 
ment of efficient approximate solutions to computationally hard problems, and is 
the fourth in the series after Aalborg (1998), Berkeley (1999) and Saarbriicken 
(2000). RANDOM is concerned with applications of randomness to computa- 
tional and combinatorial problems, and is the fifth workshop in the series fol- 
lowing Bologna (1997), Barcelona (1998), Berkeley (1999) and Geneva (2000). 

Topics of interest for APPROX and RANDOM are: design and analysis of 
approximation algorithms, inapproximability results, on-line problems, random- 
ization and de-randomization techniques, sources of randomness, average-case 
analysis, approximation classes, randomized complexity theory, scheduling prob- 
lems, routing and flow problems, coloring and partitioning, cuts and connectivity, 
packing and covering, geometric problems, network design, and various applica- 
tions. 

The volume contains 14+11 contributed papers, selected by the two program 
committees from 34+20 submissions received in response to the call for papers, 
together with abstracts of invited lectures by Michel Goemans (MIT), Russell 
Impagliazzo (San Diego), Anna Karlin (Washington), Luca Trevisan (Berkeley), 
and Salil Vadlran (MIT-Harvard) . 

We would like to thank all of the authors who submitted papers, our invited 
speakers, the members of the program committees 
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Using Complex Semidefinite Programming 
for Approximating MAX E2-LIN3 



Michel X. Goemans 

MIT, Dept, of Mathematics, Room 2-351, Cambridge, MA 02139, 

goemansOmath . mit . edu 



Abstract. A number of recent papers on approximation algorithms 
have used the square roots of unity, —1 and 1 to represent binary decision 
variables for problems in combinatorial optimization, and have relaxed 
these to unit vectors in real space using semidefinite programming in 
order to obtain near optimal solutions to these problems. In this talk, we 
consider using the cube roots of unity, 1, , and e l47r ^ 3 , to represent 

ternary decision variables for problems in combinatorial optimization. 

Here the natural relaxation is that of unit vectors in complex space. 

We use an extension of semidefinite programming to complex space to 
solve the natural relaxation, and use a natural extension of the random 
hyperplane technique to obtain near-optimal solutions to the problems. 

In particular, we consider the problem of maximizing the total weight of 
satisfied equations x u — x v = c (mod 3) and inequations x u — x v ^ c 
(mod 3), where x u £ {0, 1, 2} for all u. This problem can be used to model 
the MAX 3-CUT problem and a directed variant we call MAX 3-DICUT. 

For the general problem, we obtain a 0.79373-approximation algorithm. If 
the instance contains only inequations (as it does for MAX 3-CUT), we 
obtain a performance guarantee of pj + arccos 2 (— 1/4) « 0.83601. 
Although quite different at first glance, our relaxation and algorithm 
appear to be equivalent to those of Frieze and Jerrum (1997) and de 
Klerk, Pasechnik, and Warners (2000) for MAX 3-CUT, and the ones of 
Andersson, Engebretson, and Hastad (1999) for the general case. 

This talk is based on a joint result with David Williamson, to appear in 

[ 11 - 
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Hill-Climbing vs. Simulated Annealing 
for Planted Bisection Problems 



Russell Impagliazzo* 

Computer Science and Engineering, 

UCSD 9500 Gilman Drive, La Jolla, CA 92093-0114 
tcarson. russellOcs .ucsd.edu 



While knowing a problem is NP - complete tells us something about a problem’s 
worst-case complexity, it tells us little about how intractible specific distributions 
of instances really are, whether these distributions are mathematically defined 
or come from real-world applications. Frequently, iVP-complete problems have 
been successfully attacked on “typical” instances using heuristic methods. Little 
is known about when or why some of these heuristics succeed. 

An interesting class of heuristics are local search algorithms, a group that 
includes hill-climbing, Metropolis, simulated annealing, tabu-search, WalkSAT, 
etc. These methods are characterized by implicitly defining a search graph on 
possible solutions to an optimization problem and using some (often random- 
ized) method for moving along the edges of this graph in search of good quality 
solutions. Of course, assuming P # NP, no such method will always succeed in 
quickly finding optimial solutions for NP-hard problems. However, many such 
methods have been successfully used in practice for different classes of NP-hard 
optimization problems. 

While a large amount of effort has gone into both theoretical and experimen- 
tal studies of such heuristics, large gaps in our knowledge remain. For example, 
it is not clear whether one of the methods is universally preferable to another, or 
whether all of the above methods are incomparable. Do some methods succeed 
where others fail? Or is one of the methods strictly better than the others, for 
all interesting problem domains? 

These questions seem difficult to answer theoretically; there are very few 
successful analyses of local search heuristics for specific classes of problems, and 
even fewer comparisons of different methods. In fact, no natural examples of 
optimization problems where one method provably succeeds and another fails 
are known. 

It is just as difficult to tackle these questions experimentally, because each 
general method has a large number of variations and parameters, and success 
seems quite sensitive to the details in implentation. While experiments showing 
a method succeeds are not uncommon, experimental studies showing that a 

* Research Supported by NSF Award CCR-9734880, NFS Award CCR-9734911, grant 
#93025 of the joint US-Czechoslovak Science and Technology Program, and USA- 
Israel BSF Grant 97-001883 
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method fails or comparing two methods are both rare and hard to interpret. 
Does the method fail because it is by nature ill-suited to the problem domain, 
or because the implementation or parameters chosen were not optimized? 

This talk will summarize joint work with Ted Carson addressing these ques- 
tions, both theoretically and experimentally (EH). In this talk, we concen- 
trate on one AbP-complete problem, the minimum graph bisection problem, 
and one class of distributions for instances: the planted bisection graph model 
(H). The planted bisection random graph model has been used as a benchmark 
for evaluating heuristics Q1I2I9I8I11I7B6] ). In this model, a graph is constructed 
from two initially disjoint n node random graphs, drawn from G„ iP , by adding 
edges between cross graph vertex pairs with some lower probability q < p. If 
p — q = u)(y/l ogny/ mn/n 2 ), then with high probability the bisection separating 
the two is the unique only optimal solution. 

A landmark paper of Jerrum and Sorken proves that the Metropolis algo- 
rithm succeeds with high probability for random instances of the graph bisection 
problem drawn from the planted bisection model G n ^ Ptqi when p is sufficiently 
greater than q. They proved their result for p — q = n -1 / 6+£ , significantly greater 
than needed for optimality. This is one of the few optimization problems for 
which Metropolis or simulated annealing are provably good. However, they left 
open the question of whether the full Metropolis algorithm was necessary, or 
whether degenerate cases of the algorithm such as random hill-climbing would 
also succeed. 

We were expecting that these simpler heuristics would fail on this problem. 
However, our initial experimental work showed that hill-climbing methods suc- 
ceeded at finding the planted bisection not just in the range of parameters above, 
but whenever the planted bisection was optimal. 

Based on intuition gathered from these experiments, and some ideas from 
[6], we proved that a simple, polynomial-time, hill-climbing algorithm for this 
problem succeeds in finding the planted bisection with high probability if p—q = 
fi (Zog 3 n -1 / 2 ), within polylog factors of the threshold ( [3j ) . The above algorithm 
had one unnatural restriction. However, the same analysis shows a purely ran- 
domized hill-climbing algorithm succeeds in finding the planted bisection in poly- 
nomial time if p — q = l?(n _1 / 4+e \/mn), for any e > 0. This universal algorithm 
is a degenerate case of both Metropolis and go-with-the-winners, so this result 
implies, extends, and unifies those by Jerrum and Sorken, Dimitriou and Im- 
pagliazzo, and Juels mm- 

Thus, there are no examples of planted graph bisections where sophisticated 
heuristic methods have been proven to work, but where simple hill-climbing 
algorithms fail. This result emphasises the need to find instance distributions 
for optimization problems that can be used to discriminate between local search 
heuristic techniques. 

Returning to experimental results, we identified a candidate for such a dis- 
criminatory problem class. Namely, there exist parameters slightly below the 
threshold where the planted bisection is not optimal, but all “good” quality so- 
lutions are “near” the planted bisection. For these parameters, we were able to 
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distinguish experimentally between various heuristic methods. We were able to 
find parameters where local optimization typically failed to find any good solu- 
tion, but where an appropriate simulated annealing schedule finds solutions of 
near-optimal quality. We also showed experimentally that our simulated anneal- 
ing performed better than Metropolis at any fixed temperature. ( 15131 ). 

To do these experiments, we first gathered statistics using a Go-with-the- 
winners sampling algorithm. This allowed us to characterize “random” bisections 
of different qualities, and to identify biases introduced by optimization methods. 
In particular, we showed that hill-climbing methods produce solutions that are 
overly locally-optimized, in the sense that they were smaller in cut than would 
be typical for their distance from the planted bisection. For sufficiently high 
temperatures, Metropolis avoids this type of bias, but for lower temperatures, 
it produces a similar bias. Thus, there is an optimal Metropolis temperature 
for these distributions; above this temperature, Metropolis seems to converge 
rapidly to its stationary distribution, but below this temperature it seems to 
reach locally optima and become stuck. 

Even for the lowest temperature for which it has rapid convergence, Metropo- 
lis does not produce optimal solutions. On the other hand, the solutions it does 
produce at this temperature are significantly closer to the planted bisection. 
Starting a second, more greedy, phase of optimization from the results of the 
first allow further progress without introducing bias. Repeating this for a few 
steps leads to a cooling schedule for Simulated Annealing that significantly im- 
proves on Metropolis at any fixed temperature. 

We hope that in future work this experimentally observed gap will be rigor- 
ously proven. This would give the first natural proven separation between Sim- 
ulated Annealing, Metropolis and hill-climbing algorithms. More importantly, it 
would give insight into when and why some heuristic methods do better than 
others. 
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Anna R. Karlin 
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University of Washington 
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Abstract. We present a probabilistic generative model for web search 
which captures in a unified manner three critical components of web 
search: how the link structure of the web is generated, how the content 
of a web document is generated, and how a human searcher generates 
a query. The key to this unification lies in capturing the correlations 
between each of these components in terms of proximity in latent se- 
mantic space. Given such a combined model, the correct answer to a 
search query is well defined, and thus it becomes possible to evaluate 
web search algorithms rigorously. We present a new web search algo- 
rithm, based on spectral techniques, and prove that it is guaranteed to 
produce an approximately correct answer in our model. The algorithm 
assumes no knowledge of the model, and is well-defined regardless of the 
accuracy of the model. 

Joint work with D. Achlioptas, A. Fiat and F. McSherry 
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Error-Correcting Codes 
and Pseudorandom Projections 



Luca Trevisan* 

U.C. Berkeley, Computer Science Division, 615 Soda Hall, Berkeley, CA 

lucaOeecs . berkeley . edu 



Abstract. In this talk we discuss constructions of hash functions, ran- 
domness extractors, pseudorandom generators and hitting set generators 
that are based on the same principle: encode the “inputQ’ using an error- 
correcting code, select a random (or pseudorandom) subset of the bits of 
the encoding, and output the encoded codeword restricted to such bits. 

This general approach is common to constructions of very different com- 
binatorial objects, and somewhat different strategies are used to analyse 
such different constructions. 

An early application of the encode-and-project paradigm is in a paper 
of Miltersen IMil98| . applied to the construction of a family of hash 
functions with low collision probability. 

Suppose we want to construct a family of hash functions h : (0, 1}" — > 

{0, l} m , and suppose that we have an error-correcting code C : (0, l} n — > 

{0, 1 }" whose minimum distance is, say, h/3. Let us introduce the fol- 
lowing notation: if y = (j/i, ■ • • , Vk) £ {0, l} fc and S = {si, . . . , si] C [k], 
with si < s 2 < • • • < si then y\ S = (y ai ,y S2 ,. . . ,y ai ) £ {0, 1}'. Then we 
can define a family of hash functions where each function of our family 
is indexed by a subset S C [h] of size m, and hs(x) = C(x)\s- 
It is immediate to see that the collision probability is at most (2/3) m . 

The advantage of this construction is that both the encoding and the 
projection can be evaluated in constant time on a unit-cost RAM (see 
IMII981 ). so that the hash functions in the family can be evaluated in 
constant time. 

One can see a very similar construction at work in the pseudorandom 
generator construction in ISTVOlj and in the randomness extractor con- 
struction in [Tre99ll . with some fundamental difference. One difference 
is that the “projection” is not chosen uniformly at random, but it is 
rather generated from a seed of logarithmic length using the “combina- 
torial design construction” of Nisan and Wigderson INW941 . Another 
difference is that, in the case of pseudorandom generators, it is not 
enough for the error-correcting code to have large minimum distance, 
but a certain type of sublinear-time list-decoding algorithm must also 
exist I STVOll ; in the case of random extractors, howver, any code with 

* Work supported by a Sloan Research Fellowship and an NSF Career Award. 

1 By “input” we mean the actual input for hash functions, the weakly random input 
for randomness extractors, and the description of a computationally hard problem 
for pseudorandom generators and for hitting set generators. 
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a relative minimum distance close to 1/2 can be used Tre991 . Con- 
structions of pseudorandom generators and/or randomness extractors 
in IRRV 9911b W00ITSUZ01I use error-correcting codes and the Nisan- 
Wigderson combinatorial designs, with improvements in the construc- 
tion, in the analysis, and in the composition of the basic construction 
with other tools (and with itself). 

The Nisan-Wigderson approach yields a randomness-efficient but some- 
what counter-intuitive way of generating projections. When the input (or 
hard problem) is encoded as a multivariate polynomial (which is done 
in STVOl i and is a possible implementation of ITre991 ). a more natural 
approach to projection is to consider lines. Miltersen and Vinodchan- 
dran 1MV99I show that by encoding a hard problem as a multivariate 
polynomial, and then restricting it to axis-parallel lines, one can get a 
hitting set generator construction, which in turn can be used to deran- 
domize complexity classes. The approach of IMV99I does not replicate 
the result of [IW97 (I MV99 I can prove P = BPP only under a stronger 
assumption than the one postulated in 1IW971 ). however it can prove a 
result on AM that is stronger than the best known result based on Nisan- 
Wigderson !KvM99| . The analysis in IMV991 appears to be substantially 
different from the analysis in [STVOli , although the hard function is en- 
coded using the same error-correcting code, and the “only” difference is 
in the way the encoding is projected (lines versus the approach based on 
Nisan-Wigderson) . 

Ta-Shma, Zuckerman and Safra 1TSZS01] showed how to construct ran- 
domness extractors by encoding the input using multivariate polynomials 
and then restricting it to a subset of a line (the line is selected using the 
seed of the extractor). While the construction of ITSZSOI is virtually 
identical to the one in |MV99i , the analysis is completely different. 
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We survey recent works which unify four of the most important and widely 
studied objects in the theory of pseudorandomness, namely: pseudorandom gen- 
erators, expander graphs, error- correcting codes , and extractors. Since all of these 
objects are “pseudorandom” in some sense, it is not surprising that construc- 
tions of one of these objects may be useful in constructing another. Indeed, there 
are many examples of this in the past, which we do not attempt to summarize 
here. Instead, we focus on how recent works show that some of these objects 
are almost the same, when interpreted appropriately. All of the connections we 
describe involve extractors, lending further credence to Nisan’s assertion that 
they “exhibit some of the most ‘random-like’ properties of explicitly constructed 
combinatorial structures” N:s!)(ij . 

Extractors and Pseudorandom Generators. In 1999, Trevisan |Tre99| discovered 
a completely unexpected connection, showing that every “black-box” construc- 
tion of pseudorandom generators from hard Boolean functions is also a construc- 
tion of extractors. This unified two previously distinct lines of research, opened 
the door to many improved extractor constructions i'Tre99, RRV99, ISWOO 
rruzoii . and provided new ways of thinking about both kinds of objects. 

Extractors and Expander Graphs. It has been known since the defining paper of 
Nisan and Zuckerman [NZ96] that an extractor can be viewed as a certain kind of 
expander graph, and indeed this was exploited in [WZ99] to construct expanders 
that “beat the eigenvalue bound.” Recently, this connection between extractors 
and expander graphs has been exploited more fully, yielding solutions to two 
classic problems about expander graphs: 1. a simple, iterative construction of 
constant-degree expanders jRVW00| . and 2. graphs with expansion greater than 
1/2 times the degree [ TUZ01I . 

Extractors and Error- Correcting Codes. Several recent constructions of extrac- 
tors (such as Trevisan’s construction, its successors, and jRSWOOj l make im- 
portant use of error-correcting codes. This is no coincidence, as Ta-Shma and 
Zuckerman [ITZOlj have shown that extractors can be viewed as a generaliza- 
tion of list-decodable error-correcting codes. This viewpoint has led to new ex- 
tractor constructions, which more directly exploit the properties of algebraic 
codes [TZSOlj . 
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Abstract. We study integrated prefetching and caching in single and 
parallel disk systems. A recent approach used linear programming to 
solve the problem. We show that integrated prefetching and caching can 
also be formulated as a min-cost multicommodity flow problem and, ex- 
ploiting special properties of our network, can be solved using combinato- 
rial techniques. Moreover, for parallel disk systems, we develop improved 
approximation algorithms, trading performance guarantee for running 
time. If the number of disks is constant, we achieve a 2-approximation. 



1 Introduction 



In today’s computer systems there is a large gap between processor speeds and 
memory access times, the latter usually being the limiting factor in the per- 
formance of the overall system. Therefore, computer designers devote a lot of 
attention to building improved memory systems, which typically consist of hard 
disks and associated caches. Caching and prefetching are two very well-known 
techniques for improving the performance of memory systems and, separately, 
have been the subject of extensive studies. Caching strategies try to keep actively 
referenced memory blocks in cache, ignoring the possibility of reducing processor 
stall times by prefetching blocks into cache before their actual reference. On the 
other hand, most of the previous work on prefetching tries to predict the memory 
blocks requested next, not taking into account that blocks must be evicted from 
cache in order to make room for the prefetched blocks. Only recently researchers 
have been working on an integration of both techniques |l[2|3|4|5|7j . 

Cao et al. [3| and Kimbrel and Karlin [7] introduced a theoretical model for 
studying “Integrated Prefetching and Caching” (IPC) that we will also use in this 
paper. We first consider single disk systems. A set S of memory blocks resides 
on one disk. At any time a cache can store k of these blocks. The system must 
serve a request sequence a = <r(l), . . . , cr(n), where each request a(i ), 1 < i < n, 
specifies a memory block. The service of a request takes one time unit and can 
only be accomplished if the requested block is in cache. If a requested block is 
not in cache, it must be fetched from disk, which takes F time units, where 
F £ IN. If a missing block is fetched immediately before its reference, then the 
processor has to stall for F time units. However, a fetch may also overlap with 
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the service of requests. If a fetch is started i time units before the next reference 
to the block, then the processor has to stall for only max{0, F — i) time units. In 
case i > 1, we have a real prefetch. Of course, at most one fetch operation may 
be executed at any time. Once a fetch is initiated, a block must be evicted from 
cache in order to make room for the incoming block. The goal is to minimize the 
processor stall time, or equivalently the elapsed time , which is the sum of the 
processor stall time and the length n of the request sequence. 

In parallel disk systems with D disks we have D sets of memory blocks 
Si, . . . , So, where 5<j is the set of blocks that reside on disk d, 1 < d < D. We 
assume that each block in the system is located on only one of the disks. The 
main advantage of parallel disk systems is that blocks from different disks may 
be fetched in parallel. Thus if the processor has to stall at some point in time, 
then all the fetches currently being active advance towards completion. If a fetch 
is initiated, we may evict any block from cache, which corresponds to the model 
that blocks are read-only. Again the goal is to minimize the processor stall time. 

Cao et al. m studied IPC in single disk systems. They presented simple 
combinatorial algorithms, called conservative and aggressive , that run in polyno- 
mial time and approximate the elapsed time. Conservative achieves an approxi- 
mation factor of 2, whereas aggressive achieves a better factor of min{2, 1+F/k}. 
Karlin and Kimbrel j7] investigated IPC in parallel disk systems and presented a 
polynomial-time algorithm whose approximation guarantee on the elapsed time 
is (1 + DF/k). In [l] Albers, Garg and Leonardi developed a polynomial-time 
algorithm that computes an optimal prefetching/caching schedule for single disk 
systems. For parallel disk systems they developed a polynomial-time algorithm 
that approximates the stall time. The algorithm achieves an approximation fac- 
tor of D , using at most D — 1 extra memory locations in cache. All the results 
presented in fl] are based on a linear program formulation. 

In this paper we show that IPC in single and parallel disk systems can be 
formulated as a min-cost multicommodity flow problem and, exploiting special 
properties of the network, can be solved using combinatorial methods. These 
results are presented in Sec. [3 We first investigate the single disk problem. We 
describe the construction of the network and establish relationships between min- 
cost multicommodity flows and prefetching/caching schedules. We prove that a 
combinatorial approximation algorithm by Kamath et al. |B] for computing min- 
cost multicommodity flows, when applied to our network, computes an optimal 
prefetching/caching schedule in polynomial time. We then generalize our multi- 
commodity flow formulation to parallel disk systems. With minor modifications 
of the original network we are able to apply the algorithm by Kamath et al. [B] 
again. We derive a combinatorial algorithm that achieves a D-approximation on 
the stall time, using at most D — 1 extra memory location in cache. Thus, the 
results presented in [Tj can also be obtained using combinatorial techniques. 

For parallel disk systems, D is the best approximation factor on the stall 
time currently known. This factor D is caused by the fact that the approach 
in [T heavily overestimates the stall times in prefetching/caching schedules: 
Stall time is counted separately on each disk, i.e. no advantage is taken of the 
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fact that prefetches executed in parallel simultaneously benefit from a processor 
stall time. In Sec. [3] we develop improved approximation guarantees that are 
bounded away from D. We are able to formulate a trade-off. For any z € IN, 
we achieve an approximation factor of 2 {D / z) at the expense of a running time 
that grows exponentially with z. If the number D of disks is constant, we ob- 
tain a 2-approximation. For the special case D = 2 we also give a better 1.5- 
approximation. Again, our solutions need D — 1 extra memory locations in cache. 
The improved approximation algorithms can also be obtained using min-cost 
multicommodity flows. However, for the sake of clarity and due to space limita- 
tions we present an LP-formulation in this extended abstract. 



2 Modeling IPC by Network Flows 

We first consider single disk systems. We build up our combinatorial algorithm in 
several steps. Given a request sequence cr, we first construct a network G = (V,E) 
with several commodities such that an integral min-cost flow corresponds to 
an optimal prefetching/caching schedule for a , and vice versa. Of course, an 
algorithm for computing min-cost multicommodity flows does not necessarily 
return an integral flow when applied to our network. We show that a non-integral 
flow corresponds to a fractional prefetching/caching schedule in which we can 
identify an integral schedule using a technique from [Tj. 

The main problem we are faced with is that we know of no combinatorial 
polynomial-time algorithm for computing a (non-integral) min-cost flow in our 
network. We solve this problem by applying a combinatorial approximation algo- 
rithm by Kamath et al. [Sj. For any e > 0, S > 0, the algorithm computes a flow 
such that a fraction of at least 1 — e of each demand in the network is satisfied 
and the cost of the flow is at most (1 + S) times the optimum. Unfortunately, 
the flow computed by the algorithm, when applied to our network, does not 
correspond to a feasible fractional prefetching/caching schedule: It is possible 
that (a) more than one block is fetched from disk at any time and (b) blocks are 
not completely in cache when requested. We first reduce the flow in the network 
to resolve (a). This reduces the extent to which blocks are in cache at the time 
of their request even further. We then show that, given such flow, we can still 
derive an optimal prefetching/caching schedule, provided that e and S are chosen 
properly. 

2.1 The Network 

Let a be a request sequence consisting of n requests. We construct a network 
G = (U, E) with n + 1 commodities. Associated with each request <r(i) is a 
commodity i, 1 < i < n. This commodity has a source Sj, a sink tj and demand 
di = 1. Let dj be the block requested by a (i). For each request a(i), we introduce 
vertices Xi and x\. These vertices are linearly linked, i.e. there are edges (xijx'f), 
1 < i < n, and edges {x^, Xi+i), 1 < i < n — 1, each with capacity k and 
cost 0. Intuitively, this sequence of vertices and edges represents the cache. If 



Minimizing Stall Times in Disk Systems Using Multicommodity Flows 



15 




Fig. 1 . Sketch of the network for request sequence abcbc and F = 2 



commodity i flows through then block is in cache when er(j) is served. 

To ensure that at is in cache when a(i) is served, we insert an edge (a ;(,tj) with 
capacity 1 and cost 0, and there are no other edges into ti or i. e. commodity 
i must pass through (xi,x'). 

Let pi be the time of the previous request to Gq, i. e. pt is the largest j, j < i, 
such that a,i was requested by a(j). If at is requested for the first time in er, then 
we set Pi = 0. To serve er (i), block cq can (1) remain in cache after cr(p,) until 
request a(i), provided that pi > 0, or can (2) be fetched into cache at some time 
before er(j). To model case (1) we introduce an edge (si,x' p .), if pi > 0, with 
capacity 1 and cost 0. To model case (2) we essentially add edges ( Si,Xj ), for 
j = Pi + 1, . . . , i, indicating that a fetch for ai is initiated starting at the service of 
er(j). For the special case j = i the edge represents a fetch executed immediately 
before a(i). If i — j < F, then the processor has to stall for F — (i — j) time units 
and hence we assign a cost of F — (i — j) to edge ( Si,Xj ). Figure |T] illustrates 
this construction for the examplary request sequence er = abcbc and fetch time 
F = 2. Edges outgoing of a source Si, i G {1, . . . , 5}, are labeled with their cost. 

So far our construction allows a flow algorithm to saturate more than one of 
the edges that correspond to fetches executed simultaneously (consider, e. g. the 
edges (si,Xi-\) and (si-i, Xi- 2 ) for some i such that a(i — 2), a(i — 1) and a(i) 
are pairwise distinct). However, we have to make sure that at most one fetch 
operation is executed at any time. Therefore, in our construction we split the 
“super edge” ( Si,Xj ) into several parts. For any £, 1 < t < n — 1, let [£,£+ 1) be 
the time interval starting at the service of a(£) and ending immediately before 
the service of a(£ + 1). Interval [0,1) is the time before the service of <r(l). 

We have to consider all the fetches being active at some time in [£,£ + 1), 
for any fixed i. A fetch for ai starting at <r(j), j < i, is active during [£,(. + 1) 
for £ = j , . . . , m in {j + F,i} — 1. For any fixed i and j with 1 < i < n and 
Pi + 1 < j < i we introduce vertices uf ? - and wfj where £ = j , . . . , minjy + F, i} — l. 
These vertices are linked by edges of capacity 1 and cost 0. More specifically, 
we have edges £ = j, . . . , min{j + F, i} - 1, and edges 

t = j + 1, . . . , min{j + F. i} — 1. The last vertex in this sequence, w L, is linked to 
Xj with an edge of cost 0 and capacity 1. Finally we add an edge (sj, vfj ) , where 
£ = min{j + F, i} — 1, to the first vertex in this sequence with cost F — ( i—j ) 
and capacity 1. In this construction we excluded the case j = i because a fetch 
for ai initiated at a(i) is somewhat special: The fetch is performed completely 
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before <r(i), i. e. it does not overlap with any request, and the processor stalls 
for F time units. The fetch is active at some time during [i — 1, i). We introduce 
vertices v V 1 and wff 1 linked by an edge of capacity 1 and cost 0. Vertex w*” 1 
is linked to Xi with an edge of the same capacity and cost. Finally, we have an 
edge {si,vlr ) of capacity 1 and cost F. 

Next we describe the role of the (n+l)-st commodity, which is used to ensure 
that no two prefetches are performed at the same time. More precisely, we ensure 
that at most one prefetch is executed in any fixed interval [£,£+1), 1 < £ < n — 1. 
For any fixed £, let fe be the number of prefetches whose execution overlaps with 
[£,£+ 1), i. e. 

fe = \{vlj I 1 < i < n, Pi + 1 < j < i}| . (1) 

Commodity n + 1 has a source s„+i, a sink t n + 1 and a demand of d n + i = 
J2e=i(ft ~ -*-)• The fl° w f rom s n + 1 to t n+ \ is routed through the edges 
and newly introduced “subsinks” t^ +1 , 1 < £ < n — 1. For any pair of vertices 
vfj and wfj we introduce edges (s n+ i,vfj) and (wfj, t£ +1 ) with capacity 1 and 
cost 0. Additionally, we insert edges (t% l+1 ,t n+ 1 ) with capacity fe — 1 and cost 
0. Now consider a fixed interval [£, £+ 1), 1 < £ < n — 1. Every prefetch for some 
a, initiated at a(j) that is active at some time during [£, £ + 1) is represented by 
a “super edge” ( Si,Xj ) and contains an edge For fixed £ the network 

contains fe such edges. The capacities fe — 1 of the edges (t^ +1 ,t n+ i) ensure 
that only one of the edges can carry a flow of commodity i, i < n. 

If two or more such edges were carrying flow of commodity i < n, then the 
capacity constraint would be violated at some edge (f£ +1 , t n+ i), for some £' ^ £, 
or demand d n+ \ would not be satisfied. 

The following lemma states that our network correctly models IPC on a single 
disk. Its proof is omitted in this extended abstract. 

Lemma 1. Any feasible integral flow of cost C in G corresponds to a feasible 
prefetching/caching schedule with stall time C for a, and vice versa. 

2.2 Properties of Optimal Flows 

We show that a non-integral flow in our network corresponds to a fractional 
prefetching/caching schedule, defined in the following way. 

Definition 2. Given an instance of the problem IPC, we define the set of frac- 
tional solutions as a superset of the set of integral solutions to the instance. A 
fractional solution may deviate from an integral solution in the following way: 

— The amount to which a block resides in cache may take a fractional value 
between 0 and 1. However, this amount must be 1 while the block is requested. 

— Fractional parts of blocks in cache arise due to partial evictions or partial 
fetches. For each time interval, the net amount of blocks fetched must not be 
larger than the net amount of blocks evicted, and the net amount of blocks 
fetched must not exceed 1. 
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— Stall times are accounted as follows: If a fetch to 5 £ [0, 1] units of block a(j) 
is initiated starting at the service of reference a(i) and j — i < F holds, we 
incur a stall time of 6(F — (j — i)) time units. 

Loosely speaking, the main difference between integral and fractional solutions 
lies in the possibility to interrupt fetches and to leave parts of a block in cache 
between consecutive requests to it. Regarding the second item in the above 
definition we may assume w. 1. o. g. that between any two consecutive references 
to a specific block, the points of time where the block is evicted from cache 
precede the ones where the block is fetched back. 

Lemma 3. Let G be the network obtained by transforming a request sequence 
a of the problem IPC according to the construction in Sec. \2.1\ A valid multi- 
commodity flow with cost C within the network G corresponds to a fractional 
pref etching/ caching schedule with stall time C . 

The next lemma follows immeditately from (T] ; it was shown that a fractional 
solution is a convex combination of polynomially many integral solutions. 

Lemma 4. Let L be a fractional solution to an input for IPC. There is a 
polynomial-time algorithm that computes an integral prefetching/caching sched- 
ule L* from L where the stall time of L* is less than or equal to the one of L. 



2.3 Applying the Approximation Algorithm 

We show how to compute a flow in our network and how to derive an optimal 
prefetching/caching schedule. We apply the algorithm by Kamath et al. Eby 
setting e := 1/(4 F 2 n 3 ) and S := 1/(3 nF). These settings have been derived from 
the easy-to-see upper bound d n +i < n 2 F on the demand of commodity n + 1. 

As the approximation algorithm only satisfies a fraction of 1 — e of each 
commodity, the flow out of each source Sj, * £ {1, . . . , n}, is lower bounded by 1 — 
e. Moreover, commodity n+l might lack an amount of ed n+ 1 < eFn 2 . We assume 
pessimistically that this leads to an additional “illegal” flow with value eFn 2 
during a time interval [1,1+1), I £ {1, . . . , n — 1}, in so far as edges representing 
fetches in that interval are not “congested” properly by commodity n + l. 

Let q := 1 — Ed n +i — e be a crucial lower bound on the flow of commodities 
1 ,... ,n. We can transform the flow (f> output by the approximation algorithm 
into a uniform flow <f>' which directs exactly p units of flow from s, to ij for 
any commodity i £ {1, . . . ,n}. The main idea is to reduce, for each edge, the 
flow of commodity i proportionally to the relative amount of flow of commodity 
i on the considered edge. Then we end up with a uniform flow <// which does 
not “overflow” any interval [£, i + 1) and delivers the same amount for each 
commodity. 

In view of Definition [2] and the equivalence described in Lemma [3] the flow <j>' 
corresponds to a fractional solution to IPC in which all blocks have size g. Flow 
<f>' corresponds to a fractional solution to IPC in which all blocks have size g and 
the number of cache slots is upper bounded by k/ g — hereinafter we call such a 
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solution a g-solution. According to Lemma |4] we may interpret the fractional 
solution which corresponds to <j>' as a convex combination of integral p-solutions. 

In order to analyze the quality of the above-described convex combination of 
integral ^-solutions, we have to establish a lower bound on g. As d n + ± < Fn 2 , 
we obtain g > 1 — ed n + 1 — e > 1 — 2sd n +\ > 1 — l/2nF. Next we estimate 
the cost C of the convex combination of ^-solutions. Since the approximation 
algorithm outputs flows with cost at most (1 + 6) OPT, where OPT is the cost 
of an optimal schedule, and reducing flows to g does never increase cost, the 
following upper bound on C holds: 

C < (1 + 5) OPT = OPT /(3 nF) + OPT < OPT +1/3 since OPT < nF . 

(2) 

We underestimated the cost C of the convex combination of ^-solutions by 
an additive term of at most n( 1 — g)F. This is due to the fact that each block 
corresponding to a specific commodity has size 1 in reality, but size g in the 
convex combination. By increasing the block size (or, equivalently, the flow of 
the corresponding commodity), the cost can rise by at most (1 — g)F. Hence, 
the cost C' of the convex combination of integral solutions is at most 

C' <C + n( 1 - g)F < OPT +1/3 + nF/2riF < OPT +1 . (3) 

From C < OPT +1 we conclude that the convex combination contains at 
least one integral solution with optimal costs. As the number of possible integral 
solutions is bounded by Fn 2 (see rax an optimal component, i.e. integral solu- 
tion, within the convex composition can be computed in polynomial time. How- 
ever, each integral solution originates from a ^solution where a block has size 
g. Since the cache is still k large, it remains to prove that no integral component 
of the convex composition does hold more than k blocks in cache concurrently. 

Since, w.l.o.g., k/(nF) < 1 holds, the number of blocks of size g held con- 
currently in cache is at most 

-^^ 5- k ( 1 + ^) <h+1 (4) 

because (1— e') -1 < l+2e' for any e' £ [0, 1/2]. Therefore, (/c+l)g > k holds, and 
we would obtain a contradiction if an integral solution held more than k pages in 
cache concurrently. Finally, this implies that we have found a feasible and optimal 
prefetching/caching schedule. The overall running time of the approximation 
algorithm is 0*(£~ 3 S~ 3 c\E\\V\ 2 ), where c denotes the number of commodities 
and O* means “up to logarithmic factors”. As |V| = 0(n 2 ) and \E\ = 0(n 2 ), we 
obtain the polynomial upper bound O* {{nF) 3 {n 3 F 2 ) 3 {n + l)n 2 n 4 ) = 0*(n 18 ). 
Now we state the main result of this section. 

Theorem 5. An optimal solution to an input for IPC can be computed by a 
combinatorial algorithm in polynomial time. 
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2.4 Generalization to Multiple Disks 

The solution developed for single disk systems can be generalized to multiple 
disks. Due to space limitations, we only state the main result here. 

Theorem 6. There is a combinatorial polynomial-time algorithm which com- 
putes a D- approximation to an input for IPC if the number of disks is D and 
there are D — 1 slots of extra cache available. 

3 Improving the Approximation Factor 

In this section we return to the linear program by Albers, Garg and Leonardi jT] 
for the multiple disk case and improve its approximation factor. If the number D 
of disks is constant, we achieve a 2-approximation. We know that our approach 
leads to a linear program which can also be stated as a min-cost multicom- 
modity flow problem. We omit that representation as we consider the improved 
approximation guarantee to be the most important contribution. 

3.1 Bundling Intervals 

The drawback of the LP formulation by Albers, Garg and Leonardi l] is that 
it overestimates the stall time of prefetching/caching schedules. We present an 
LP that counts stall time more accurately. As in [il] we represent time periods 
in which fetch operations are executed by open intervals I = (i,j), with i = 
0 , . . . , n — 1 and j = i + 1 , . . . , n, where n = |cr| is the length of the given 
request sequence. Such an interval I = ( i,j ) corresponds to the time period 
starting after the service of cr(i) and ending before the service of a(j). Its length 
is |/| = j — i — 1. If 1 1\ < F, then F — |/| units of stall time must be scheduled in 
the fetch operation. Since fetches take F time units, we can restrict ourselves to 
intervals with j < i + F + 1. For each potential interval / we introduce a copy I d 
for each disk d £ {1, . . . , D}. Let I be the resulting set of all these intervals. The 
LP in PQ determines which intervals of I should execute prefetches. Stall times 
are counted separately for the intervals and disks, which causes the overestimate. 

The main idea of our LP is to form bundles of intervals and treat each 
bundle as a unit: In any bundle either all the intervals or no interval will execute 
a fetch. We next introduce the notion of bundles and need one property of 
optimal prefetching/caching schedules. An interval / = (T1G2) properly contains 
interval J = (ji , j 2 ) (which is not necessarily associated with the same disk) if 
*1 < ji and J2 < ii hold. The proof of the next lemma is omitted. 

Lemma 7. An optimal (fractional or integral) prefetching/ caching schedule for 
a system with D disks does not include fetch intervals properly containing each 
other. 

Definition 8. A set of intervals B, \B\ 7^ 0, is called a bundle if B contains 
at most one interval from each disk and is overlapping. A set of intervals B is 
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called overlapping if it includes no intervals properly containing each other but 
has for all but one I = (*i, * 2 ) G B some interval J = (ji, j’ 2 ) € B, J ^ I, such 
that j 1 > i\ and J overlaps with I. Two intervals I = (ii , * 2 ) and J = (ji,j 2 )» 
ii < j 1 , are called overlapping if either j 1 < *2 — 1 is vaZzd, or j\ = ii — 1 and 
additionally fa — ii — 1 < F hold. 

Fix a 2 £ IN with z < D. We will bundle intervals from up to z disks. In this 
extended abstract we assume for simplicity that D / z £ IN. We partition the disk 
set into D/z sets {1, ... , 2 }, {2 + 1, , 2 z}, . . . , {D — z + 1, ... , D}. Now let B z 
be the set of all the bundles composed of intervals from X, with the additional 
restriction that the intervals of a bundle must come from the same subset of the 
disk partition. One can show that \B Z \ < n(F + 1 ) 2z {D / z)z\. 

We are nearly ready to state the extended linear program for IPC and D > 1. 
For each bundle B £ B z , we introduce a variable x(B) which is set to 1 if a 
prefetch is performed in all intervals in bundle B , and is set to 0 otherwise. In 
order to specify which blocks are fetched and evicted we use variables fjd a and 
ejd a for all I d £ T and all blocks a. Variable fjd a (respectively e/d i0 ) is equal to 
1 if a is fetched (respectively evicted) in I d . Of course e/d a = fjd a = 0 if a does 
not reside on disk d. For a bundle B £ B z , let s(B) be the minimum stall time 
needed to execute fetches in all the intervals of B assuming that no other fetch 
operations are performed in the schedule. The value s(B) can be computed as 
follows. Let ( 01 , bi ), . . . , (a m , b m ) be the sequence of all intervals in B obtained 
by sorting them by increasing end index, where intervals with the same end index 
are sorted by increasing start index breaking ties arbitrarily. One can easily verify 
that in an optimal schedule for B , stall times occur at the end of intervals, the 
fetch in (ai,&i) is started at the latest point in time (i.e. immediately before 
request <22 if M yf cq and after cq otherwise) whereas the fetches in (a,, bf), i > 2 , 
are started at the earliest point in time. We determine the amounts of stall times 
needed at the end of intervals. Let fa, fa, ■ ■ ■ ,W with m' < m be the sequence 
obtained from 61 , . . . , b m by eliminating multiple occurrences of the same value 
and keeping only the indices ij such that b,. + j y^ 6 ^.. By definition, fa := 0 
and bi 0 := 0. For j = l,...,m / , interval (oj., 6 ,.) is the shortest interval with 
end index b,_ and determines the stall time to be inserted before that request. 
The function h : { b 2l , . . . , bi , } — > IN that indicates the actual stall time needed 
before request b tj is defined inductively, for j = 1 , . . . , m! , as follows: 

MM := max jo, F — (b ij - a i} - 1) - ^ Mr) j • 

re{ 6 < 1 ,..., 6 ^_ 1 }: re{o 4 j +l,... , 6 ^- 1 } 

Using this definition we have s(B) := Xq=i MM,)' 

In order to refer to individial disks, we need for d £ {1, . . . , D} the projections 
nd'- B z -» 1 , where nd{B) = I if I £ B and / resides on disk d, and TTd{B) = 0 
if B contains no interval associated with disk d. The value of n d is well defined 
since at most one interval from each disk is part of a bundle. Now the extended 
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linear program reads as follows. Minimize the objective function 

E x (B)s(B) (5) 

bgb z 

subject to 



V* e {1, . . . , n}, Vd 


E <B) < 1 


(6) 
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(12) 



Here we have taken over some terminology from the original formulation in |I]. 
The first set of contraints ensures that for each disk and each point of time, the 
amount of fetch is at most 1. The second set of constraints guarantees for each 
interval on every disk that the amount of blocks fetched in the interval is at most 
the overall amount of blocks evicted in that interval. For a specific interval I d , 
we allow a bundle variable x(B ), where B contains / d , to take value 1; observe 
that B might consist of I d as the only element. If a bundle variable x(B) is 1, the 
second set of constraints allows fetches in all intervals belonging to the bundle. 
Please note that constraint © only ensures that at most one prefetch operation 
may be executed while serving a request. Especially, it allows prefetches to be 
started in the midst of stall times, such the exact point of time where a prefetch 
is started may be unspecified if there is stall time at the beginning of an interval. 
We will argue later that this freedom is justified. Constraints (a m have been 
adapted from the LP formulation in [Tj and ensure that a block is in cache at the 
time of its reference. The objective function finally counts the s-values, which 
are related to parallel stall times, for bundles whose variables are 1. It remains 
to prove that a solution to the extended linear program induces a valid schedule 
whose stall time is counted at least once by the value of the objective function. 

Consider an arbitrary integer solution to the extended LP, which specifies 
an assignment to the variables //, a ,e/, a and x(B). Using fj a and e/, a , we know 
between which requests a prefetch operation must be started, but may choose 
the exact point of time of the start if the related requests are intermitted by stall 
time. We inductively construct a schedule whose stall time is bounded from above 
by the value of the objective function. First, we sort the bundles B for which 
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x(B ) = 1 holds by increasing maximum end index (of the intervals in the bundle) 
and, if equality holds, by increasing minimum start index. Let B\, . . . ,B m be 
the resulting sequence. Suppose that we have already constructed a schedule 
for Bi, , B r _ i- For bundle B r , we have to schedule fetches and evictions for 
those blocks whose variables // i0 and e/ ja have been set to 1 and / £ B r . We 
use the notation introduced for defining the stall times s(B) on page [20] Let 
(ai, &i), . . . , (a m , b m ) be the sequence of intervals in B r . We first insert h(b ie ) 
units of stall time before 6^, i £ {1, . . . ,m'}, and then schedule the fetches in 
( a.i,bi ), i £ {1, as follows. The fetch in (ai,6i) is started at the latest 

possible point in time. More precisely, if ai < b\, we start the fetch with the 
service of request cii + 1; otherwise the fetch is scheduled immediately before the 
service of b\. The fetches in (a,;. b, ), i > 2, are started at the earliest point in time 
after such that the required disk is available. The definition of the h - values 
ensures that we reserve at least F time units for each fetch irrespectively of stall 
times which are caused by fetches in intervals from bundles B \, . . . , B r -\. In 
fact, a reserved time interval might even be longer. However, this is no problem. 
The fetch simply completes after F time units and the corresponding disk is 
then idle for the rest of the interval. Thus, the constructed schedule is feasible 
and we insert exactly s(-Bj) units of stall time. 

3.2 Achieving the (2D / z )- Approximation 

Lemma 9. The extended linear program for D disks has an integral solution of 
cost at most (2D/z) OPT. 

Proof. Suppose we have been given an optimal integral prefetching/caching 
schedule of stall time OPT. We restrict ourselves to an arbitrary subset of the 
partition 



of the disk set {1, . . . , D}. W. 1. o. g., this subset is {1, . . . , z}. We consider only 
stall times caused by fetches in intervals associated with disks {1 In 

the following, we specify an assignment to the variables associated with disks 
{1 ,... ,z} such that the stall time that arises by executing only the fetches on 
disks {1, . . . , zj is counted at most twice in the objective funtion of the linear 
program. Repeating this process for all the subsets of the above partition, we 
obtain an assignment to all the variables x(B ) for B £ B z . As the objective 
function is separable with respect to the bundles in B z and therefore with respect 
to the ( D / z ) subsets of the above partition, we count a specific stall time in the 
optimal prefetching/caching schedule at most 2 (D/z) times. 

By I' C I we denote the set of all intervals associated with the disk set 
{1, . . . , z} in which prefetches are performed. According to Lemma 0 we have 
no intervals properly containing each other in the set T' . Therefore, we can order 
the intervals in T' by increasing start points and (if these are equal) by increasing 
end points. Let I \, . . . , I m be the resulting sequence. We partition I' into bundles 
according to the following greedy algorithm. 



D/z-l 




(13) 
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H := 0 

for j = 1, . . . , m do 

if interval Ij U B is a bundle then set B := I 7 U B 

else output B as an element of the partition and set B := 0. 

Let Bi U • • • U Be, £ < m, be the partition of I' obtained by this process. 
Our solution to the linear program is constructed by setting x(Bj ) to 1 for 
j £ {1, . . . ,£}. The variables //, Q and e/ ja are set according to which blocks are 
fetched in the intervals of the considered bundle. This process is repeated for 
each subset of the partition m of the disk set. All remaining variables are zero. 

In the following, we revert to the subset {1, ... ,z} and denote by OPT* 
the stall time incurred if counting only fetch intervals associated with disks 
z}. For bundles Bj, j £ {1, . . . , £}, let s'(Bj) be the sum of the stall 
times in the optimal schedule between the start of the first fetch in Bj and the 
completion of the last fetch in Bj. Obviously, s(Bj) < s'(Bj). One can show that 
eU s '^ < 2 • OPT*, which implies that the value of the objective function 
is at most 2 OPT. □ 

Lemma 0 implies that there is also a fractional solution to the extended LP with 
cost at most (2 D/z) OPT. Using techniques from [T], we can convert a fractional 
solution to the extended LP with cost C to an integral prefetching/caching 
schedule with stall time C if D — 1 extra memory locations in cache are available. 
Thus we obtain an approximation algorithm of factor 2 D/z. 

Theorem 10. There is a polynomial p{n) such that for each z £ {1, 
with D/z £ IN there is a algorithm with running time 0{p{n) • n • (F + 1) Z H) 
that computes a 2D / z- approximation for IPC provided that D — 1 extra memory 
locations are available in cache. 

Finally, for D = 2 the extended linear program does not seem to give an 
improved approximation. However, in this special case, we can show an even 
better approximation factor of 1.5. 

Lemma 11. If D = 2, the extended linear program with z = 2 has an integral 
solution of cost at most 1.5 • OPT. 
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Abstract. We discuss two approximation approaches, the primal-dual 
schema and the local-ratio technique. We present two relatively simple 
frameworks, one for each approach, which extend known frameworks for 
covering problems. We show that the two are equivalent, and conclude 
that the integrality gap of an integer program serves as a bound to the 
approximation ratio when working with the local-ratio technique. 



1 Introduction 

The primal-dual method for solving linear programs was originally proposed by 
Dantzig, Ford, and Fulkerson m for solving linear programs. Over the years this 
method has become an important tool for solving combinatorial optimization 
problems. Many algorithms which solve combinatorial optimization problems 
such as Dijkstra’s shortest path algorithm El, Ford and Fulkerson’s network 
flow (or minimum s , t-cut) algorithm |13j , and Kuhn’s assignment algorithm I20| 
either use the method or can be analyzed by it (see, e.g., [15] for more details). 

Combinatorial optimization problems can often be formulated as integer 
programs. However, unlike the above mentioned problems, the LP-relaxation 
of many problems have non-integral optimal solutions (and an integrality gap 
which is greater than 1), therefore, the primal-dual method cannot be used to 
solve them. The primal dual schema for approximation is a modified version 
of the primal-dual method (see El)- While both complementary slackness con- 
ditions are imposed in the classical setting, we impose the primal conditions 
but relax the dual conditions when working with the primal-dual schema. A 
primal-dual approximation algorithm constructs an approximate primal solution 
and a feasible dual solution simultaneously. The approximation ratio is derived 
from comparing the values of both solutions. Many LP-based algorithms can 
be analyzed by the primal-dual schema. However, the first algorithm using this 
schema is Bar- Yehuda and Even’s jj] approximation algorithm for the vertex 
cover problem (and the set cover problem). Following the work of Goemans and 
Williamson iia, Bertsimas and Teo [8] proposed a primal-dual framework to de- 
sign and analyze approximation algorithms for integer programming problems of 
the covering type. As in [14] this framework enforces the primal complementary 
slackness conditions while relaxing the dual conditions. A detailed survey on the 
primal dual schema was written by Goemans and Williamson m- 
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Recently, Jain and Vazirani [19] presented a primal-dual approximation al- 
gorithm which uses a different approach from the one described here. Instead of 
the “usual” mechanism of relaxing the complementary slackness conditions, they 
relaxed the primal conditions. As far as we know, this is the only primal-dual 
algorithm which works with programs that have negative coefficients. 

The local-ratio approach was developed by Bar- Yehuda and Even [5] in or- 
der to approximate the weighted vertex cover problem. Ten years later Bafna, 
et al. [Tj extended the Local-Ratio Theorem from j5J by introducing the idea of 
minimal covers in order to construct a 2-approximation algorithm for the feed- 
back vertex set problem. Later, Bar-Yehuda [3j presented a unified local-ratio 
approach for developing and analyzing approximation algorithms for covering 
problems. He further extended the Bafna et al. [I] extension of the Local-Ratio 
Theorem. His extension yields a generic r-approximation algorithm, which can 
explain most known optimization and approximation algorithms for covering 
problems. Lately, Bar-Noy et al. [2J have presented an approximation framework 
for solving resource allocation and scheduling problems. Their algorithms are 
based on the local-ratio technique, but can be interpreted within the primal- 
dual schema. This study was the first to present a primal-dual or a local-ratio 
approximation algorithm for a natural maximization problem. 

We present a local-ratio framework which generalizes the one from ESI , in 
which only non-negative weight subtractions were considered. We found this 
limitation to be redundant, and, therefore, our generic algorithm can be used 
to design and analyze a wider family of algorithms. We also present a primal- 
dual approximation framework which extends the generic algorithm from [8|. 
As done before, our framework relaxes the dual complementary slackness con- 
ditions. However, the use of valid inequalities with non-negative coefficients en- 
ables us to widen the variety of algorithms which fall with in the scope of our 
framework. Furthermore, due to [2] , we are able to extend our frameworks to 
handle maximization problems. (This extension will appear in the full version.) 
By showing that primal-dual’s valid inequalities and local-ratio’s weight func- 
tions are equivalent notions we show that both frameworks are actually one and 
the same. A corollary to this equivalence is that the integrality gap of an integer 
program serves as a bound to the approximation ratio of a local-ratio algorithm 
(as in the case of a primal-dual algorithm). Also, note that in cases where a 
primal-dual algorithm serves as a subroutine (e.g., multyphase primal-dual E2B, 
see also m it can be replaced by the corresponding local-ratio algorithm. We 
demonstrate the combined approach on a variety of problems. First, we present a 
linear time approximation algorithm for the generalized hitting set problem. This 
algorithm achieves a ratio of 2 in the special case of the generalized vert-ex cover 
problem, and is better in time complexity than Hochbaum’s fTB| 0(nmlog — ) 
2-approximation algorithm for this special case. We exhibit the first local-ratio al- 
gorithm which uses negative weights by presenting an analysis to Ford and Fulk- 
erson’s m algorithm for minimum s, t-cut. Furthermore, in order to explain this 
algorithm within the primal-dual schema, we use a linear program with inequal- 
ities which have some negative coefficients. We also analyze 2-approximation 
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algorithms for the minimum feedback vertex set problem |7j, and for a non- 
covering problem called the minimum 2-satisfiability problem iee ]. Finally, we 
explain a ^-approximation algorithm for a maximization problem called the in- 
terval scheduling problem [2]. This example provides some insight to the design 
and analysis of approximation algorithms for maximization problems, and to the 
equivalence between the two approaches in the maximization case. 

We believe that this study contributes to the understanding of both ap- 
proaches, and, especially, that it will help in the design of algorithms for max- 
imization problems. Another point of interest is that while the use of a linear 
objective function and linear advancement steps are an integral part of both 
techniques, our frameworks are not limited to linear integer programs (though 
we did not find a natural non-linear application) . 

2 Definitions 

We consider the following problem: Given a non-negative penalty (or profit) 
vector p £ M ra , find a solution x that minimizes (or maximizes) the inner product 
p ■ x subject to some set T of feasibility constraints (not necessarily linear) on 
x £ {0, l} n . Note that we will sometimes abuse the notation by treating a vector 
x £ {0, l} n as the set of its 1 coordinates, i.e., as {j : Xj = 1}. The correct 
interpretation should be clear from the context. 

A minimization (maximization) problem (lF,p) is called a covering ( packing ) 
problem if T is monotone, i.e., for all x' such that x' > x ( x ' < x) if x satisfies 
T then x' satisfies T . Note that a monotone set of linear constraints typically 
includes inequalities with non-negative coefficients. 

Primal-dual approximation algorithms for covering problems traditionally 
reduce the size of the problem at hand in each iteration by adding an element 
whose corresponding dual constraint is tight to the primal solution (see □Ml)- 
Local-ratio algorithms for covering problems implicitly add all zero penalty el- 
ements to the solution, and, therefore, reduce the size of the problem in each 
step as well (see 0)- In order to implement this we alter the problem definition 
by adding a set (or vector), denoted by z, which includes elements which are 
considered to be taken into the solution. This makes it easier to present primal- 
dual algorithms recursively, and to present local-ratio algorithms in which the 
addition of zero penalty elements to the partial solution is explicit. 

More formally, given a minimization (maximization) problem (fF,p) and a 
vector z, we are interested in the following problem: Find a vector x such that (1) 
z PI x = 0; (2) x U z satisfies T (x in the maximization case); And, (3) minimizes 
(maximizes) the inner product p ■ x. (Note that when 2 = 0 we get the original 
problem.) We define the following for a minimization problem (lF,p, z). A vector 
x is called a feasible solution if 2 D x = 0 and 2 U x satisfies T . We denote the set 
of feasible solutions with respect to T and 2 by S(lF, z). A feasible solution x* 
is called an optimal solution if every feasible solution x satisfies p ■ x* < p ■ x. A 
feasible solution x is called an r- approximation if p ■ x < r ■ p ■ x* , where x* is an 
optimal solution. The corresponding definitions for a maximization problem can 
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be understood in a strait-forward manner. Given a minimization (maximization) 
problem (F, p,z), a feasible solution x is called minimal ( maximal ) if for all j £ x 
(j £ x U z) the vector x \ {j} (a; U {j}) is not feasible. 



3 Local-Ratio 



The following is the Local-Ratio Theorem given in our terminology. 

Theorem 1 (|Z|). Let F be a set of constraints, and let z be a boolean vector. 
Also, let p,pi, and p 2 be penalty (or profit) vectors such that p = pi + P 2 - Then, 
if x is an r- approximation with respect to (F,pi,z) and (F,p 2 ,z), then x is an 
r -approximation with respect to (F,p,z). 

Note that F can include arbitrary feasibility constraints and not just linear, 
or linear integer, constraints. Nevertheless, all successful applications of the local 
ratio technique to date involve problems in which the constraints are linear. 

Bar- Yehuda [3J defined the following for covering problems: 

Definition 1. Given a covering problem (F,p), a weight function S is called 
r-effective with respect to F , if every minimal solution x with respect to {F , S) 
satisfies 5x < r ■ Sx* , where x* is an optimal solution. 

We prefer the following more practical definition: 

Definition 2. Given a minimization (maximization) problem (F ,p, *), a weight 
function 5 is called r-effective with respect to (F, z), if\/j £ z, S j =0, and there 
exists (3 such that every minimal (maximal) solution x with respect to (F,z) 
satisfies: (3 < 5 ■ x < r/3 (r(3 < 5 ■ x < (3). In this case we say that (3 is a witness 
to 6 ’s r -effectiveness. If z = 0 we say that S is r-effective with respect to F . 

Obviously, by assigning (3 = Sx*, where x* is an optimal solution, we get that 
the first definition implies the latter. For the other direction, note that (3 < Sx*. 

A local-ratio algorithm for a covering problem works as follows. First, con- 
struct an r-effective weight function S, such that p — 5 > 0 and there exists 
some j for which pj = Sj. Such a weight function is called p-tight. Subtract 6 
from the penalty vector p. Add all zero penalty elements to the partial solution 
2 . Then, recursively solve the problem with respect to (F ,p — S,z). When the 
empty set becomes feasible (or, when z becomes feasible with respect to F) the 
recursion terminates. Finally, remove unnecessary elements from the temporary 
solution by performing a reverse deletion phase. Algorithm LRcov is a modified 
version of the generic local-ratio approximation algorithm for covering problems 
from |3] • (The initial call is LRcov(p, 0).) The main difference between the al- 
gorithm from [31 and the one given here is that in the latter the augmentation 
of the temporary solution is done one element at a time. By doing this we have 
the option not to include zero penalty elements which do not contribute to the 
feasibility of the partial solution z. 
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Algorithm LRcov(p, z) 

1. If 0 <E S{F,z) return 0 

2. Construct a p-tight weight function 8 which is r-effective w.r.t. (T, z ) 

3. Let j $ z be an index for which 5j = Pj 

4. x «— LRcov(p — 5, z U {j}) 

5. If x qL S(fF , z) then x <— x U {j} 

6. Return a: 



Theorem 2 . Algorithm LRcov outputs an r -approximate solution. 

Proof. Let p k ,S k and z k be p, 5, and 2 , respectively, at the end of the /c’th 
level, and let t be the maximal depth of the recursion. We prove by reverse 
induction on k that Algorithm LRcov returns an minimal r-approximation with 
respect to ( tF,p k ,z k ). For k = t, 0 is a minimal optimal solution. Otherwise, 
examine x at the end of the fc’th level. By the induction hypothesis x \ {j} is 
an minimal r-approximation with respect to (J 7 , p k+1 , z k+1 ) . Therefore, £ is a 
minimal solution with respect to (iF, z k ). Moreover, due to the fact that Pj +1 = 0 
x is an r-approximation with respect to (tF,p k+1 , z k ). By the r-effectiveness of 
8 k and Theorem [I] £ is an r-approximate solution with respect to (tF,p k ,z k ). 

Algorithm LRcov finds an r-effective weight function for the current problem, 
adds a zero penalty element to the solution, and then changes the problem. Such 
an algorithm heavily relies upon the fact that the constraint set T is monotone, 
and, therefore, the optimum of {T 1 p k+1 , z k+1 ) is not greater than the optimum 
of (iF,p fc+1 , z k ). Algorithm LRmin uses a slightly different approach. Instead 
of using r-effective weight functions with respect to (lF,z), it uses weight func- 
tions which are r-effective with respect to T , and updates p until a minimal 
zero-penalty solution is found. By using this “subtract first, ask questions later” 
approach we are able to approximate problems by using inequalities with some 
negative coefficients or weight functions with some negative weights. We illus- 
trate this approach in the Sect. E] 



Algorithm LRmin(p) 

1. Search for a feasible minimal solution x C {j : pj = 0} 

2. If such a solution was found return x 

3. Construct a p-tight weight function 8 which is r-effective w.r.t. T 

4. x LRmin(p — (5) 

5. Return x 



Theorem 3. Algorithm LRmin outputs an r-approximate solution. 

4 Primal-Dual 

In [15] Goemans and Williamson presented a generic algorithm based on the 
hitting set problem which is defined as follows: Given subsets Ti, . . . ,T q C E 
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and a non-negative cost c e for every element e £ E, find a minimum-cost subset 
x C E such that xnT, ^ 0 for every i £ {1, . . . , q}. In turns out that many known 
problems are special cases of this problem. The LP-relaxation of its formulation 
as an integer program and the corresponding dual are: 

min J2e£E c e x e max Y.'1-i Vi 

S.t. HeGTi x e> 1 Vi £ {1, . . . , q} S.t. Ei-.eeTi Vi < c e Ve £ E 

x e > 0 Ve £ E yi> 0 Vi £ {1, . . . , q} 

where x e = 1 iff e £ x. The primal complementary slackness conditions are: 
x e > 0 => Ei- e eT- Vi = c o and the dual conditions are: j/j > 0 => \x D Tj\ = 1. 

The algorithm from m starts with the dual solution y = 0 and the non- 
feasible primal solution x = 0. Then it iteratively increases the both solutions 
until the primal solution becomes feasible. As in the primal-dual method, if we 
cannot find a feasible primal solution, then there is a way to increase the dual 
solution. In this case, if x is not feasible there exists a k such that x D = 0. 
Such a set is referred to as violated. In each iteration a violation oracle supplies a 
collection of violated subsets V C {Ti, . . . , T g ]0 Then we increase simultaneously 
and at the same speed the dual variables corresponding to subsets in V. When x 
becomes feasible a reverse delete step is performed. This step discards as many 
elements as possible from the primal solution x without losing feasibility. 

Let x‘ and €j be the set output by the algorithm, and the increase of the dual 
variables corresponding to V 7 in iteration j. Then, y, = Ej t gv e A Ei=i Vi = 

Ej-=ilVjl e j> and c ( xf ) = Eeex f c e = ELi \ xf rT i\yi = EUi \x f r\Ti\ 
Ej-Tiev- e i = Ej=ibE ev |x^nTj|. From this, it is clear that the cost 
of x? is at most the value of the dual solution times r if Vj £ {1, . . . , i} 
Er^Vj \ x * bl Tj < r\Vj\. Examine iteration j of the reverse deletion step. We 
know that when was considered for removal, no element ep with j' < j 
has been already removed. Thus, after ej is considered for removal the tem- 
porary solution is x J ' = x? U {ei, . . . , e^-i}. x is called a minimal augmen- 
tation of {ei, . . . , ej-i}, i.e., is feasible and x- 7 \ {e} is not feasible for all 
e 6 x J \ {ei, . . . , e^-i}. Moreover, Et^v \ x * bl Tj| < Er^ev \ x J There- 

fore, to achieve the above bound Goemans and Williamson have set the following 
requirement: Et ev- \ x ' ^ Ti| < r \Vj\ for every j £ {1, . . . , Z} and for every min- 
imal augmentation x' of {ei, . . . , ej_i}. We formalized this by the following. 

Definition 3. A set V C {Ti, . . . , T q } is called r-effective with respect to (T, z), 
tf Et,gv I x D Tj| < r | V| for every minimal solution with respect to (T 7 , z), x. 

As did Bertsimas and Teo [8| we prefer to speak in terms of inequalities. 

Definition 4. A set of valid inequalities {a 1 ® > ft 1 , . . . ,a k x > /3 fc } (or < in 
the maximization case) is called r-effective with respect to (T,p,z), if a k = 0 
for all k and j £ z, and every integral minimal feasible solution with respect to 



1 We could allow the oracle to return some sets which are not violated. See (15] . 
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z), x, satisfies: Yll=i °? x < r Yi = 1 ft 1 ■ (E the maximization case.) If the 
set contains a single inequality we will refer to this inequality as r-effective. 

An r-effective collection V can be understood as the following r-effective 
set of valid inequalities: {EeeT- x e > 1 : Ti G V}. On the other hand, the latter 
definition allows the use of other kinds of inequalities (e.g., Sect. ©. Thus, our 
goal is to find a set of r-effective valid inequalities in order to increase the dual 
solution. However, it is enough to construct a single r-effective valid inequality for 
that purpose. This is because a set of valid inequalities {cPcc > /?*} satisfies 
Yi= \ a%x < x Ya=i^ 1 by definition. On the other hand, Wi,a z x > (3 l , and, 
therefore, i®* 1 > Ei= ifi 1 - Thus, we have found an r-effective inequality: 

e;u(eLi«v,>eLi/t- 

Bertsimas and Teo [B] presented a primal-dual algorithm for linear covering 
problems which utilizes a single valid inequality in each iteration. This algorithm 
constructs a valid inequality in each iteration, and uses it to modify the current 
instance. After this modification at least one of the coordinates of the penalty 
vector p becomes zero, and this makes it possible to reduce the size of the 
problem in each iteration. Thus, the algorithm terminates after no more than 
n iterations. The performance of this algorithm depends on the choice of the 
inequalities. In fact, it corresponds to what Bertsimas and Teo call the strength 
of an inequality, which is the minimal r for which it is r-effective. 

Algorithm PDcov is a recursive version of the algorithm from [8] . (The initial 
call is PDcov(p, 0, 1).) Informally, it can be viewed as follows: construct an r- 
effective inequality; update the corresponding dual variable and the vector p such 
that p remains non-negative; find an element j whose penalty is zero; add j to 
the temporary partial solution z\ then recursively solve the problem with respect 
to T , 2 : and the new penalty vector; the termination condition of the recursion 
is met when the empty set becomes a feasible solution; finally, a reverse deletion 
phase removes unnecessary elements from the temporary solution. There are two 
differences between the algorithm from |[8] and our algorithm. First, we present 
our algorithm recursively, and, more importantly, our algorithm is not restricted 
to linear integer programs. 

Algorithm PDcov(p, z, k) 

1. If 0 £ S{F,z) return 0 

2. Construct a valid inequality a k x > (3 k which is r-effective w.r.t. (fF, z) 

3. yk <r- max {e : p — ea k > 0} 

4. Let j 0 2 be an index for which pj = ykOt k 

5. x <— PDcov(p — ykof , z U {j} , k + 1) 

6. If x £ S(F, z) then x <— x U {j} 

7. Return x 

The following theorem is based on the corresponding theorem from [8]. 
Theorem 4. Algorithm PDcov outputs an r-approximate solution. 

Proof. Denote the maximum depth of the recursion by t, and let z k be the 
temporary partial solution of depth k. Examine the fc’th inequality a k x > (3 k for 
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some k. This is a valid inequality with respect to (3F, z k ), and, therefore, it is valid 
with respect to T Thus, LP2 = min { px : \/k G {1, . . . , t} , a k x > /3 k and x > 0} 
is a relaxation of (/F,p). By the construction of x, it is a feasible solution for T, 
and, therefore, for LP2. Also, y is a feasible solution for the dual of LP2. 

Let x k be the value returned by the fc’th call to the recursion, let p k be the 
penalty vector of depth k, and let j(k) be the chosen element in the fc’th call. We 
prove by induction that p k x k < rJ2 l>k yib l . First, for k = t, we have p t x t = 0. 
For k <t we have, 

P k x k = (p k +' + y k a k )x k ^ p k+1 x k+1 +y k a k x k < r £ Vl (3 l +y k r(3 k = rJ2yif> 1 

l>k + 1 l>k 

where (1) is because = 0, and (2) is implied by the induction hypothesis 
and the r-effectiveness of the inequality. Finally, px = p 1 x 1 < rJ2i>iViP l < 
r • Opt(LP2) < r ■ Opt (JF,p) and, therefore, x is an r-approximate solution. 

Algorithm PDmin uses inequalities which are r-effective with respect to T, 
instead of using inequalities which are r-effective with respect to (3F, z) (as in 
the local-ratio case). It uses such inequalities, while updating p , and the dual 
solution, until a minimal zero-penalty solution is found. 



Algorithm PDmin(p) 

1. Search for a feasible minimal solution x C {j : pj = 0} 

2. If such a solution was found return x 

3. Construct a valid inequality a l x > bi which is r-effective w.r.t. to T 

4. yi <— max {e : p — ta l > 0} 

5. x «— PDmin2(p — yia 1 ) 

6. Return x 



Theorem 5. Algorithm PDmin outputs an r-approximate solution. 

5 Equivalence 

Lemma 1. ax > /3 is an r-effective inequality iff a is an r-effective weight 
function with the witness (3. 

Proof. We prove the lemma for minimization problems. Similar arguments can 
be used in the maximization case. Let ax > (3 be an r-effective inequality. By 
definition every minimal feasible solution x satisfies: (3 < ax < r/3. Thus, a is 
an r-effective weight function. On the other hand, let a be an r-effective weight 
function with a witness f3. Due to the r-effectiveness of a every minimal feasible 
solution x satisfies (3 < ax < r/3. Therefore, ax > f3 is an r-effective inequality. 

By Lemma [T] the use of valid inequalities can be explained by utilizing the 
corresponding weight functions and vice versa. Thus, Algorithms PDcov and 
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LRcov are actually identical. The same goes for Algorithms PDmin and LRmin. 
From this we can also conclude that the integrality gap of an integer program 
serves as a bound to the approximation ratio of a local-ratio algorithm which 
can be explained by algorithms LRcov or LRmin. 

6 Examples 

Generalized Hitting Set. The generalized hitting set problem is defined as follows. 
Given a collection of subsets S of a ground set U, a non-negative penalty p s for 
every set s £ S, and a non-negative penalty p u for every element u £ U, find a 
minimum-penalty collection of objects C C U U S, such that for all s £ S, either 
there exists u £ C such that u £ s, or s £ C. As in the hitting set problem 
our objective is to cover the sets in S by using elements from U. However, in 
this case, we are allowed not to cover a set s, provided we pay a tax p s . The 
hitting set problem is the special case where the tax is infinite for all sets. The 
generalized hitting set problem can be formalized as follows: 

min J2ueuP» x u + J2 s esPs x s 

S.t. ^2u£s X v. + x s > 1 Vs £ S 

Xt £ {0, 1} Vf e U U S 

Observe that paying the penalty p s is required only when s is not covered. Thus, 
12u£s x u+x s > 1 is Ag-effective for any set s £ S, where Ag = max {|s| : s £ S}. 
The corresponding Ag-effective weight function would be: S s (t) = e if t £ {s}Us, 
and S s (t) = 0 otherwise. The inequality remains Ag-effective if we use any value 
between 1 and Ag as cc s ’s coefficient. Analogously, S s (s) can take any value 
between e and Age. We can approximate this problem by using the “standard” 
approach for covering problems (when using the above inequalities or weight 
functions). However, we can also use a PDmin (or LRmin) to construct a linear 
time Ag-approximation algorithm. First, we use all the above inequalities in 
an arbitrary order; then a zero penalty minimal feasible solution can be found: 
all zero penalty elements and all the sets which are not covered by some zero 
penalty element. This would be a Ag-approximate solution. 

Feedback Vertex Set. Let G = ( V,E ) be an undirected graph, and let p be a 
penalty vector. The feedback vert.ex set problem is to find a minimum penalty set 
F C V whose removal from G leaves an acyclic graph. Becker and Geiger [ 7 ] and 
Chudak et al. [HI proved that Ylvev deg(u) — ^X^gF* deg(v) for every minimal 
feedback vertex set F, where F* is an optimal solution. Bar- Yehuda [3j indicated 
that this actually means that the weight function S(v) = deg(u) is 2-effective, 
and then used this weight function to simplify the presentation of Becker and 
Geiger’s [ 7 | 2-approximation algorithm. The corresponding 2-effective inequality 
is J2v£V deg(u)x„ > J2 V £F* deg(u). Therefore, a primal-dual analysis to this al- 
gorithm can be given by using algorithm PDcov. It is important to note that you 
do not need to know the value Y^veF- deg(v) in order to execute the algorithm. 
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Minimum s,t-cut. Given an undirected graph G = (V,E), a pair of vertices 
s, t , and a non-negative penalty for each edge e £ E, we wish to find a cut of 
minimum penalty. In order to solve this problem we solve the directed version: 
given a digraph G = (V, E) 7 a pair of vertices s and t , and a non-negative penalty 
p e for each arc e £ E, find a directed cut of minimum penalty. Denote by Q the 
set of all (directed) paths from s to t. The minimum s, f-cut problem can be 
formalized by using the following integer program: 

min 

S-t. Eeeq^e > 1 V<? € Q 
x e £ {0, 1} Ve £ E 



where x e = 1 iff e is in the cut. 

We analyze Ford and Fulkerson algorithm m ■ Examine the inequality: 
Y^e&q x e ~ Eeeg a ' e — 1 f° r some path q £ Q, where q is the corresponding 
path from t to s (i.e., the one going in the opposite direction). The correspond- 
ing weight function is: S q (e) = (|e D q\ — \e D g|) • e for some e. This means that 
arcs in the ’right’ direction are given penalty e, while arcs in the ’wrong’ di- 
rection are given penalty — e. We show that S q and its corresponding inequality 
are 1-effective. Let F be some minimal feasible solution, i.e, F is a minimal set 
of arcs which induces an s,t- cut (5,5). Any path going from s to t must go 
through F an odd number of times: k > 1 times from 5 to 5 and k — 1 times in 
the opposite direction. Therefore, S q (F) = Eeef S e = k ■ e + (k — 1) • (— e) = e. 
The algorithm works as follows: iteratively find an unsaturated path q from s to 
t and subtract S q from the p, where e is the value that saturates some arc e £ q; 
when no such path exists pick a minimal set of arcs from the set {e : p e = 0}. 
Note that, in this case, we are allowed to change our minds about an arc, i.e., 
to increase the penalty of an arc during the execution of the algorithm. 

Minimum 2 SAT. Given a 2CNF formula with m clauses on the variables 
x\,...,x n and a penalty vector p, the minimum weight 2-satisfiability problem is 
to find a minimum penalty truth assignment which satisfies tp (or determine that 
no such assignment exists). Gusfield and Pitt presented an 0{nm) time 2- 
approximation algorithm for 2SAT. We shall give an intuitive explanation of this 
algorithm in the context of the approximation frameworks. First, we can check 
whether p is satisfiable by using the algorithm from m- Therefore, we may 
assume that tp is satisfiable. In order to construct a 2-approximation algorithm 
we need to find 2-effective inequalities or weight functions. Given a literal £, let 
T(£) be the set of variables which must be assigned TRUE whenever t is assigned 
TRUE. Let Xi and Xj be variables such that Xj £ T(xl). For such variables the 
inequality Xi + Xj > 1 is valid. Moreover, it is easy to see that this inequality is 
2-effective. Constructing T(xi), and a zero penalty feasible solution can be done 
efficiently by using constraint propagation (for a more detailed analysis see i)- 

Interval Scheduling. Bar-Noy et al. [2] have presented a framework for approx- 
imating resource allocation and scheduling problems. We analyze one of the 
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algorithms from j? which approximates a packing problem called the interval 
scheduling problem. This algorithm does not fall within the frameworks discussed 
earlier, but the analysis offers some intuition to the equivalence between the two 
approaches in the maximization case. An interesting point is that while a penalty 
vector is kept non-negative during the execution of an algorithm, a profit vector 
is expected to be non-positive when the subtraction phase is over. This means, in 
primal-dual terms, that in the maximization case the dual solution is initially not 
feasible, and that it becomes feasible at the end of the algorithm. The negative 
coordinates of the profit vector correspond to the non-tight constraints. 

In the interval scheduling problem we are given a set of activities , each requir- 
ing the utilization of a given resource. The activities are specified as a collection 
of sets Ai, , A m . Each set represents a single activity: it consists of all pos- 
sible instances of that activity. An instance I £ At is defined by the following 
parameters: (1) A half-open time interval [s(I),e(I)) during which the activity 
will be executed. s(/) and e(/) are called the start-time and end-time of the 
instance; And, (2) the profit p(I) > 0 gained by scheduling this instance of the 
activity. A schedule is a collection of instances. It is feasible if it contains at most 
one instance of every activity, and at most one instance for all time instants t. 
In the interval scheduling problem our goal is to find a schedule that maximizes 
the total profit accrued by instances in the schedule. More formally, our goal 
is to find an optimal solution to the following integer program on the boolean 
variables {xi : I € Ai, 1 < * < mj. 

max J2i Pi x i 

S -t- Ef:s(!)<i<e(J) X I < 1 Vf 

T,i:ieAi X i Vte{i,...,m} 

xi £ {0, 1} Vi VI £ Ai, 

The algorithm from [2j works as follows. First, delete all instances with non- 
positive profit. If no instances remain, return the empty schedule. Otherwise, 
let J be an instance with minimum end-time, and let A(J) and I(J) be the 
activity to which instance J belongs and the set of instances intersecting J 
(including J), respectively. Define 6(1) = p(J) if I £ A(J) Ul(J), and 5(1) = 
0 otherwise. Then solve the problem recursively with respect to a new profit 
vector, p — 6. Let S' be the schedule returned. If S' U { J} is a feasible solution 
return S = S' U {J}. Otherwise, return S = S' . 5 is ^-effective, and, therefore, 
this is a ^-approximation algorithm. Bar-Noy et al. also presented a primal- 
dual interpretation to their algorithm. However, in order to do so they modified 
the original algorithm by using a different ^-effective weight function: S'(J) = 
p(J), 5' (I) = \p(J) if I £ A(J) UI(J) \ {J}, and 5' (I) = 0 otherwise. We 
present the following alternative analysis. Consider the inequality Y^iex(j) Xl + 
YlieA(j) x i — which states that only one instance from A(J) and one from 
I(J) can belong to a feasible schedule. This inequality is ^-effective because a 
maximal schedule contains at least one instance from A(J) Ul(J). The original 
algorithm can be explained by the ^-effective inequality Yliex(J)uA(J) Xl — 2- 
The difference between 6 and 5' (or between their corresponding inequalities) is 
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the ratio between the weight of I and the weights of the other instances. In fact, 
any value between 1 and 2 would have been acceptable. 
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Abstract. In this paper we study some aspects of weighted flow time 
on parallel machines. We first show that the online algorithm High- 
est Density First is an 0(l)-speed 0( l)-approximation algorithm for 
P\ri,pmtn\'^2 w iFi. We then consider a related Deadline Scheduling 
Problem that involves minimizing the weight of the jobs unfinished by 
some unknown deadline D on a uniprocessor. We show that any c- 
competitive online algorithm for weighted flow time must also be c- 
competitive for Deadline Scheduling. We finally give an 0(l)-competitive 
algorithm for Deadline Scheduling. 



1 Introduction 

We consider several aspects of online scheduling to minimize the weighted flow 
time. In this problem a sequence of jobs has to be processed on a set of m 
identical machines. The *th job has a release time 7'j, a processing time or length 
Xi and a non-negative weight Wi. At any time, only one processor may be running 
any particular job. Processors may preempt one job to run another. The *th job 
is completed after it has been run for Xi time units. The flow time F) of the 
zth job is the difference between its completion time and its release date. The 
objective function to minimize is the weighted flow time w i • -P). Following the 
standard three fields notation for scheduling problems, we denote this problem 
by P\ri,pmtn\ J2 w i ■ -Pi- 

One motivation for studying weighted flow time is that by far the most com- 
monly used measure of system performance is average flow time, or equivalently 
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average user perceived latency, and weighted flow time is an obvious general- 
ization. Another motivation for considering weighted flow time is that it is a 
special case of broadcast scheduling. A good overview of broadcast scheduling 
can be found in jT|. The setting is a client-server system where the server uses 
broadcast communication to download files to the clients. One notable commer- 
cial example of such a system is Hughes’ DirecPC system; In this system the 
clients request web pages over phone lines, and the web pages are broadcasted 
via satellite to all clients (so it may be possible to satisfy several clients request 
with a single broadcast). The weighted flow time problem l|r,;, pmtn| y Wj ■ Fj 
is a special case of the broadcast problem where all requests for file i arrive at a 
single time r* (however, there may be many requests for i at time ri). 

There have been some recent results on minimizing average flow time PEE]; 
however, minimizing weighted flow time is not yet well understood. The unipro- 
cessor offline problem l\ri,pmtn\ y Wi ■ Fi is known to be NP-hard [J]. But 
there is known no offline polynomial-time constant approximation algorithm. 
The one previous positive result for weighted flow time used resource augmen- 
tation analysis. Resource augmentation analysis was proposed [S] as a method 
for analyzing scheduling algorithms that are hard to approximate. Using the 
notation and terminology of im. an s-speed c- approximation algorithm A has 
the property that the value of the objective function of the schedule that A 
produces with processors of speed s > 1 is at most c times the optimal value 
of the objective function for speed 1 processors. An algorithm is c-competitive 
if is an 1-speed c-approximation algorithm. It was shown in m that an LP- 
based online algorithm, Preemptively-Schedule-Halves-by-Mj, is an 0(l)-speed 
0(l)-approximation algorithm. 

The first contribution of this paper, covered in Section [2] is that we simplify 
the resource augmentation analysis of l\ri,pmtn\ y WiFi given in [|l l|j , while also 
extending the analysis to multiprocessors. Namely, we show that the polynomial 
time, online algorithm Highest Density First (HDF), which always runs the job 
that maximizes its weight divided by its length, is an online 0(l)-speed 0(1)- 
approximation algorithm. While this analysis of HDF is not stated in m , upon 
reflection one can see that all the insights necessary to assert that HDF is an 
0(l)-speed 0(l)-approximation algorithm on a single processor are inherent in 
their analysis of their proposed algorithm Preemptively-Schedule-Halves-by-M, . 
Besides making this result explicit, our proof has the advantages that (1) our 
analysis is from first principles and does not require understanding of a rather 
complicated LP lower bound, and (2) our analysis also easily extends to the 
multiprocessor problem P\ri,pmtn\ y WiFi, while the result in [ TTJ apparently 
does not. 

In section[o]we consider competitive analysis of l\ri,pmtn\ y WjFj. We first 
show that every c-competitive online algorithm A has to be locally c-competitive , 
that is, at every time t, the overall weight of the jobs that A has not finished by 
time t can be at most c times the minimum possible weight of the unfinished jobs 
at time t. The requirement of local competitiveness suggests that we consider 
what we call the Deadline Scheduling Problem (DSP). 
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The input to DSP consists of n jobs, with each job i having a length Xi 
and a non-negative weight Wi. Both the length of a job and its weight are re- 
vealed to the algorithm when the job is released. The goal is to construct a 
schedule (linear order) where the overall weight of the jobs unfinished by some 
initially unknown deadline D is minimized. In the standard three field scheduling 
notation, one might denote this problem as 1| di = D\ Y^ w i ■ Ui- The require- 
ment of local competitiveness means that any c-competitive online algorithm 
for l\ri,pmtn\Y^u>iF i must be c-competitive for DSP. The DSP problem can 
be seen as a dual to the deadline scheduling problem considered in [B] - The 
setting considered in [3] was the same as DSP, except that the goal was to max- 
imize the jobs completed before the deadline D on multiple machines. In [3j it 
is shown that no constant competitive algorithm for the maximization problem 
exixts. The second main contribution of this paper is an online 0(l)-competitive 
algorithm for DSP on one machine. 

2 Resource Augmentation Analysis of P\ri, pmtn\ J2 w iFi 

In this section, we adopt the following notation. We use A(s) to denote the 
schedule produced by algorithm A with a speed s processors, and abuse notation 
slightly by using A to denote H(l). Let U A (t) be the jobs that have been released 
before time t, but not finished by algorithm A by time t. We define the density of a 
job j as the ratio p,j = Wj/pj. We use y A {t) to denote the remaining unprocessed 
length of job j at time t according to the schedule produced by algorithm A. 
Similarly, we use p A (t) = pj — yf(t) to denote the processed length of job j 
before time t. So from time t, algorithm A, with a speed s processor, could finish 
j in yf{t)/s time units. We let w A (t) = y A (t )^f be the fractional remaining 

weight of job j at time t in As schedule. We denote by W A (t) = Y!j£U A (t) w o 
and by F A (t) = J2jeu A (t) w f(t) respectively the overall weight and the overall 
fractional weight of jobs that A has not completed by time t. 

It is well known that the weighted flow time of a schedule A is equal to 
f W A (t)dt. Hence to prove that an algorithm A is a c-approximation algorithm 
for weighted flow time it is sufficient to show that A is locally c-competitive, this 
meaning that, at any time t, W A (t) < clT° pt (f). Observe that, in principle, A 
might be a c-approximation algorithm without being locally c-competitive. We 
show further in this section that this in fact is not the case for the problem we 
are considering. 

Recall that, given m processors, the algorithm Highest Density First (HDF) 
always runs the up to m densest available jobs. We now prove, via a simple 
exchange argument, that giving HDF a faster processor doesn’t decrease the 
time that HDF runs any unfinished job by any time. 

Lemma 1. For any number of processors, for any job instance, for any time t, 
and for any j £ [/ HDF ^ 1+e ^(f), it is the case that p^ DF( ~ 1+e \t) > (l + e)p™ F< ' 1 ^(t), 
or equivalalently, before time t it is the case that HDF(T + e) has run job j for 
more time than HDF^I/ 
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Proof. Assume to reach a contradiction that there is a time t and a job j where 
this does not hold. Further, assume that t is the first such time where this fails 
to hold. Note that obviously t > 0. HDF(l) must be running j at time t by the 
definition of t. Also by our assumptions, HDF(1 + e) can not have finished j 
before time f, and does not run j at time t. Hence, HDF(1 + e) must be running 
some job h at time t, that HDF(l) is not running at time t. HDF(1 + e) could 
not be idling any processor at time t since j was available to be run, but wasn’t 
selected. Since HDF(1 + e) is running job h instead of job j, job h must be 
denser than job j. Then the only reason that HDF(l) wouldn’t be running h is if 
it finished h before time t. But this contradicts our assumption of the minimality 
of t. 

We now prove that F HDF (t) is a lower bound to W° pt (t) in the unipro- 
cessor setting. (As for this issue we also refer to [3j. ) Define W FHDF (t) = 
X] je(7 HDF(i)( t ) to be the remaining fractional weight at time t for 

HDF(l). So if HDF(l) has run 1/3 of a job j by time t then j only contributes 
| Wj to W FHDF (t). 

Lemma 2. For any instance of the problem l\ri,pmtn\Y^uJiFi, and any time 
t, it is the case that F RBF (t) < F° pt {t) < W° pi (t). 

Proof. The second inequality is clearly true. To prove F ,HDF (t) < F° pt (t ) we use 
a simple exchange argument. To reach a contradiction, let Opt be an optimal 
schedule that agrees with the schedule HDF(l) the longest. That is, assume that 
Opt and HDF(l) schedule the same job at every time strictly before some time 
t, and that no other optimal schedule agrees with HDF(l) for a longer initial 
period. Let i be the job that HDF(l) is running at time t and let j be the job 
that Opt is running at time t. The portion of job i that HDF(l) is running at 
time t must be run later by Opt at some time, say s. Now create a new schedule 
Opt 7 from Opt by swapping the jobs run at time s and time t, that is Opt 7 runs 
job i at time t, and job j at time s. Note that job i has been released by time t 
since it is run at that time in the schedule HDF(l). By the definition of HDF, 
it must be the case that pi > /ij . Hence, since we are considering the fractional 
remaining weight of the jobs, it follows that for all times it, F° pt ’(zt) < F° pt (u), 
and Opt 7 is also an optimal solution. This is a contradiction to the assumption 
that Opt was the optimal solution that agreed with HDF(l) the longest. 

Observe that the above Lemma is not true for parallel machines since HDF 
may underutilize some of the machines with respect to the optimum. 

We now establish that HDF(1 + e) is locally (1 + ^-competitive against 
HDFin the uniprocessor setting. 

Lemma 3. For any e > 0, for any job instance, for any time t, and for any job 
j, it is the case that W HDF ^ 1+£ ^(t) < (1 + i)F HDF (f) 

Proof. It follows from lemma [T] that the only way that W HDF< - 1+e )(f) can be 
larger than F BDF (t) is due to the contributions of jobs that were run but not 
completed by both HDF(1 + e) and HDF. Let j be such a job. Since HDFcan 
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not have processed j for more than p™ F ^ 1+e '*(t)/(l + e) time units before time 

t , 



w 



HDF (D W >1 



HDF(l+e) 



(t) 



Pj 



l + e 



Wj. 



The ratio Wj/uv IDF< ' 1 ' ) (f) is then at most 

Pj 



Pj 



HDF(l + 6 ) 

'_2 

l+e 



(*) 



When p™ F ^ 1+e ^(t) = Pj , which is where this ratio is maximized, this ratio is 
1 + 1/e. Thus the result follows. 

We are then able to conclude the following. 

Theorem 1. Highest Density First (SDF/ is a (1 + e)-speed (1 + 1)- approxi- 
mation algorithm for the problem l\ri,pmtn\ Y2 w iFi. 

Proof. If follows from lemma[2]that F HDF (f) is a lower bound to W° pt (f). From 
Lemma El the claim follows. 

Theorem 2. Highest Density First /HDF / is a (2 + 2 e)-speed (1 + e)-approxi- 
mation online algorithm for P\ri,pmtn\ Y2 w iFi- 

Proof. Omitted. 



3 Deadline Scheduling Problem 

In this secton we first show that a c-competitive algorithm for minimizing 
weighted flow time must be also locally c-competitive. We know that local c- 
competitiveness is sufficient to establish c-competitiveness. We now show that 
for online algorithms, with speed 1 processors, it is the case that local c-compet- 
itiveness is also necessary. 

Theorem 3. Every c-competitive deterministic online algorithm A for 
l\ri,pmtn\J2 w iFi must be locally c-competitive. 

Proof. Omitted. 

The negative result shown above motivates the DSP problem: the goal is to 
design an algorithm that, at each time instant f, completes a set of jobs whose 
overall weight is a constant factor away from the set of jobs completed by an 
optimal algorithm that knows t. We stress that this is equivalent to consider a 
single deadline t unknown to the algorithm but known to the off-line optimal 
adversary. The simplification with respect to minimizing weighted flow time is 
that in DSP all jobs are released at time 0 and that we consider a simplified 
objective function. 



Online Weighted Flow Time and Deadline Scheduling 



41 



Number of Jobs 


weight 


length 


Density = weight/length 


1 


k 


k i 


1/k 


k A 


1 


1 


1 



Fig. 1. A non-trivial input for DSP. k 1. 



We now give an example to show that DSP is more subtle than one might at 
first think. Consider the following set of jobs, released at time 0: 

Two natural schedules are: (1) First run the low density job followed by the 
k 3 high density jobs, and (2) First run the k 3 high density jobs followed by 
the low density job. It is easy to see that the first algorithm is not constant 
competitive at time t = k 3 , and that the second algorithm is not constant 
competitive at time t = k 3 + k 2 — 1. In fact, what the online algorithm should 
do in this instance is to first run fc 3 — k of the high density jobs, then run the 
low density job, and then finish with the remaining k high density jobs. A little 
reflection reveals that this schedule is 0(l)-competitive for this instance of DSP. 
This instance demonstrates that the scheduler has to balance between delaying 
low density jobs, and delaying low weight jobs. Further in this section we propose 
an 0(l)-competitive algorithm for DSP. This algorithm has to recursively deal 
with difficulties similar to those that arise in the instance in table |3] It is possible 
to construct instances where one can show that essentially the full complexity 
of our algorithm description is necessary to achieve 0(l)-competitiveness. 

An alternative way to look at DSP is by reversing the time axis. This makes 
DSP equivalent to the problem minimizing the overall weight of jobs that have 
been started by some unknown deadline. This problem is more formally stated 
below. 

NDSP Problem Statement: The input consists of n jobs released at time 0, 
with known processing times and weights, plus an a priori unknown deadline t. 
The goal is to find a schedule that minimizes the total weight of jobs that have 
been started before time t including the job running at time t. Intuitively, time 
t in this schedule corresponds to time ^ Pi — t in the original schedule. 

In this section, we use the notation A(t) to refer to the schedule produced by 
algorithm A before time t, and W A (t) to refer to the weight of the jobs started 
by algorithm A before time t. (This matches the use of W A (t) done in Section 
E] to denote the weight not completed by time t in the original time direction.) 
Similarly, W° pt (t) refers to the minimum weight collection of jobs with length at 
least t, and the schedule that obtains this minimum. Note that in general there is 
not a single schedule that optimizes all V s simultaneously, that is, the schedules 
Opt(t) and Opt(f') may be very different. For a fixed instance of NDSP, the 
competitive ratio of an algorithm A is the maximum over all t £ [0, y) p 7 ] of 
W A (t)/W° pt (t). 

Observe that NDSP is simply the minimum knapsack problem, for which an 
FPTAS exists [ 13 ] ■ 

We first describe a polynomial time offline algorithm Off for NDSP, and 
then show that Off is 3-approximate with respect to Opt(f). We then pro- 
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pose a polynomial time online algorithm called R for NDSP, and show that 
R is 8-competitive with respect to Off. Hence, we can conclude that R is a 24- 
competitive algorithm. Both R and Off at various points in their execution must 
select the lowest density job from some collection; Ties may be broken arbitrar- 
ily, but in order to simplify our analysis, we assume that Off and R break ties 
similarly (say by the original numbering of the jobs). 

Description of Algorithm Off: Algorithm Off computes a set of busy 
schedules Offi(t),... ,Off. u (t). The schedule produced by Off is then the 
schedule among Offi(t),... ,Off„(f) with minimum weight. When Off(f) is 
started, J is the collection of all jobs, and in general J contains the jobs that 
maight be scheduled in the schedules Off/ 1 (t), Off/, + i(f), .... the algorithm will 
produce by the end of its execution. Variable s is initialized to s = 0, and in 
general indicates is the time that the last job in the current schedule ends. 
Variable h is initially set to 1, and in general is the index for the current 
schedule Offh(t) that Off(i) is constructing. 

1. Let j be the least dense job in J 

2. Remove j from J 

3. If s + pj < t then 

(a) Append j to the end of the schedule Off/ l (t) 

(b) Let s = s + pj 

(c) Go to step [U 

4. Else if s + pj >t then 

(a) Let Off^,+i(t) = Offh(t). Comment: Initialize Off^+i(t). 

(b) Append j to the end of the schedule Off^(t). Comment: Schedule 
Offh(t) is now completed. 

(c) Let h = h + 1 

(d) Remove from J any job with weight greater than Wj/2 

(e) If the total length of the jobs in J is at least t — s then go to step 1. 



Algorithm Off intuitively tries to schedule the least dense jobs until time 
t. The algorithm Off must then be concerned with the possibility that the job 
that it has scheduled at time f, say j, is of too high a weight. So Off recursively 
tries to construct a lower aggregate weight schedule to replace j, from those 
unscheduled jobs with weight at most Wj/ 2. Observe that algorithm Off runs in 
polynomial time. 

If a job j is scheduled in step I3al then we say that j is a non-crossing job. 
If a job j is scheduled in step I4bl then we say that j is a crossing job , and that 
afterwards a Schedule Off/ 1 (t) was closed on j. 

Theorem 4. W os (t) < 3W° pt (t). 

Proof. The jobs in the schedule Off are ordered in increasing order of density. 
Assume that the jobs in Opt(f) are ordered by increasing density as well. Let 
us introduce some notation. Let be the sequence of jobs that close 
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Fig. 2. A depiction of our notation with k = 3. 

schedules Offi(f), ..., Off u (f). Let b(fh) be the time at which j h is started in 
Off? t (£)- This notation is illustrated in figure 2. 

We break the proof into two cases. In the first case we assume there is some 
schedule Off/j(f) that runs at some time instant a job of density higher than the 
job run at the same time by Opt(f). Denote by x the earliest such time instant. 
Since jobs scheduled by Opt are ordered by increasing density, we assume w.l.o.g. 
that Off/ 1 (£) starts such job at time x. Denote by j the job started by Off/j(£) at 
time x and by i the job run by Opt at time x. Clearly, g,j > gi. It is easy to see 
that Offi(f) is always running a job with density at most the density of the job 
that Opt(f) is running at that time. Hence, it must be the case that h > 1. Since 
i ^ j, there must be some job g run by Opt(£) before i, that Offh(f) does not 
run before j. Note that it may be the case that g = i. Then the only possible 
reason that Off(f) didn’t select g instead of j, is that g was eliminated in step 
l4dl before Off(f) closed a schedule on some /<*, d < h — 1. Hence, w g > Wf h _ 1 /2, 
and u!f h _ 1 < 20pt(£) since w g is scheduled in Opt(£). By the minimality of x, 
it must be the case that, at all times before time b(fh- i), the density of the 
job that Offft_i(£) is running is at most the density of the job that Opt(f) is 
running. Hence, the aggregate weight of the jobs in Off/j_i(f), exclusive of fh-i, 
is at most Opt(f). We can then conclude that Off/ l _i(£) < 30pt(f), and hence 
Off'(t) < 30pt(f) by the definition of Off. 

In the second case assume that, at all times in all the schedules Off^(£), it is 
the case that the job that Off^f) is running at this time is at most the density of 
the job that Opt(£) is running at this time. Hence, the weight of the jobs in the 
last schedule Off u (f), exclusive of f u , is at most Opt(f). Consider the time that 
Off(£) adds job f u to schedule Off u (f). The fact that Off„(f) is the last schedule 
produced by Off(f) means that the total lengths of jobs in Off u (t), union the 
jobs that are J after the subsequent step ldcll is strictly less than t. The jobs in J 
at this time, are those jobs with weight at most Wf u / 2 that were not previously 
scheduled in Off u (t). Hence, the total length of the jobs in the original input 
with weight at most Wf u /2 is strictly less than t. Therefore at least one job in 
Opt(t) has weight at least Wf u / 2. Hence, Wf u < 20pt(f). Therefore, once again 
we can conclude that Off u (f) < 30pt(f), and that Off(f) < 30pt(£). 
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We now turn to describing the algorithm R. R. is an on-line algorithm that 
produces a complete schedule of J without knowing the deadline t. The cost 
incurred by R will be the total weight of jobs started by time t. Intuitively, the 
algorithm R first considers the lowest density job, say j. However, R can not 
immediately schedule j because it may have high weight, which might mean that 
R wouldn’t be competitive if the deadline occurred soon after R began executing 
j. To protect against this possibility, R tries to recursively schedule jobs with 
weight at most Wj/ 2, before it schedules j, until the aggregate weight of those 
jobs scheduled before j exceeds Wj. We now give a more formal description of 
the algorithm R. 

Description of Algorithm R: The algorithm R takes as input a collection 
J of jobs. Initially, J is the collection of all jobs, and in general, J will 
be those jobs not yet scheduled by previous calls to R. The algorithm R is 
described below: 

While 

1. Select a j € J of minimum density. Remove j from J. 

2. (a) Initialize a collection 2 j to the empty set. I :j will be the jobs scheduled 

during step [2] 

(b) Let Jj be those jobs in J with weight at most Wj/2. 

(c) If Jj is empty then go to step El 

(d) Recursively apply R to Jj. Add the jobs scheduled in this recursive 
call to Ij. Remove the jobs scheduled in this recursive call from J . 

(e) If the aggregate weight of the jobs in Ij exceeds Wj then go to step 
El else go to step 123 

3. Schedule j after all jobs of Ij. 

End While 

We say a job j was picked if it was selected in stepQ] If, after j is picked, a 
recursive call to R is made with input Jj , we say that R recurses on j. Notice 
that like the algorithm Off, the algorithm R also runs in polynomial time. 

We now explain how to think of the schedule R as a forest, in the graph 
theoretic sense. The forest has a vertex for every job in J. For this reason we 
will henceforth use the terms job and vertex interchangeably. The jobs in the 
subtree T(j) rooted at job j are j and those jobs scheduled in the loop on step |2] 
after j was selected in step[f] The ordering of the children is by non-decreasing 
density that is the order in which R selects and schedules these jobs. The roots 
of the forest are those vertices selected in step [T| at the top level of the execution 
of the algorithm. Similarly, the ordering of the roots of the forest is also by non- 
decreasing density, and in the order in which R selects and schedules these jobs. 
We adopt the convention that the orderings of trees in the forest, and children 
of a vertex, are from left to right. 

Before proceeding we need to introduce some notation. Given a descendent 
y of a vertex x in some tree, we denote by P{x , y) the path from x to y in T, x 
and y inclusive. If y is not a descendent of x, then P(x, y) is the empty path. We 



Online Weighted Flow Time and Deadline Scheduling 



45 




Fig. 3. Forest structure of i?’s schedule. 



denote by r\, . . . , the roots of the trees in R that contain at least one job that 
R started before time t. Let . . . , be the vertices in the path P(rk, v m ), 
where v\ = rk, and where v m is the job that R is running at time t. Finally, 
denote by W(S) the total weight of the jobs in a set S. See Figure [o] for an 
illustration of the forest structure of R and some of our definitions. 

The following facts can easily be seen to follow from the definition of R. 
The post-order of the jobs within each tree is also the order that these jobs are 
scheduled by R. All jobs in a tree T(rh), h < k, are completed by R before time 
t. All jobs in a tree T(rh), h > k, are not started by R before time t. The jobs 
of T(rk) that are started by R before time t are: (i) The jobs in T(v m ), and (ii) 
the jobs in T(£), for each left sibling £ of a job on P(v 2 , v m ). All the jobs in the 
path P(rk,v m - 1 ) are not started by R before time t. All the jobs in a T(£), £ 
being a right sibling of a job on the path P(v 2 , v m ), are not started by R before 
time t. 

The following lemma shows that the aggregate weight of the vertices in a 
subtree is not too large. 

Lemma 4. For any job x in R, W(T(x)) < 4w x . 

Proof. The proof is by induction on the height of T(x). The base case (i.e. x 
is a leaf) is straightforward. Let y be x’s rightmost child. The aggregate weight 
of the subtrees rooted at the left siblings of y is at most w x , or R would not 
have picked y in tree T(x). Since R removes jobs of weight greater than w x /2 in 
stepEB we know that w y < w x /2. By induction W(T(y)) < Aw v , and therefore 
W(T(y)) < 2w x . Hence, by adding the weight of the subtrees rooted at x’s 
children to the weight of x, we get that W(T(x)) < w x + w x + 2 w x = 4 w x . 



Theorem 5. R(t) < 80ff(£). 
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Proof. We will charge all the weight of the jobs started by R before time t to 
the jobs scheduled by Off in a way that no job in Off is charged for more than 
8 times its weight. 

First notice that Off completes the roots ri,... , rk-i by time t. This is 
because t occurs in tree T(rfc), all jobs in T(r/j), h> k have density not smaller 
than fjL rk , and therefore the total length of jobs of density less than fj, rk is less 
than t. We then charge W'(T(r,)) to each r, scheduled by Off, 1 < i < k — 1, 
and therefore by Lemma 0 we charge each ri at most four times its weight. 

We are left to charge the weight of the jobs in T(r^ ) started by R before time 
t. We consider three cases: 

(i) Off schedules job rk = v i; 

(ii) Off schedules a job on path P(v i,v m ); 

(iii) Off schedules no job on path P(v i,v m ). 

In case (i), we charge W(T(rk)) to rk- This way rk is charged at most four 
times its weight. So now assume that rk is picked but not scheduled by Off. To 
finish our analysis of case (ii) and case (iii) we need the following claim, which 
illuminates the similarities between the algorithms R and Off. 

We slightly abuse notation by denoting n, ...,Tk-i as the left siblings of r*,. 

Claim. Assume Off schedules no job on P(v i, U&), b £ {1, .., m}. Then, for every 
c = 1 , ...,&, v c is the crossing job that closes Off c (f), and Off c (f) contains all 
left siblings of Vi, l = 1, c, that are scheduled by R before time t. Moreover, if 
b = m then Off schedules all the children of v m . 

Proof. We prove a stronger statement: in particular, we prove that for every 
c = 1, ..., 6, v c is the crossing job that closes Off c (f), Off c (t) contains, for every 
l = 1, .., c, all left siblings of vi and possibly other jobs, but only from the trees 
rooted at the left siblings of vi. 

The proof is by induction. For the basis of the induction observe that the 
overall length of jobs of density less than ji vi is less than t while Off does not 
schedule i>i, therefore is the crossing job that closes Offi(f), that in turn also 
contains all the root vertices r\,...,rk-i and possibly other vertices, but only 
from T(ri), ...,T(rk- 1 ). For the inductive hypothesis, we assume that the claim 
is true up to vertex v c that closes solution Off c (f). We then prove the claim for 
vertex u c +i. Off c (f) was closed on vertex v c . Solution Off c +i will contain all the 
jobs of density higher than but less than /i Vc+1 , and weight at most w Vc / 2. 
Also these jobs are contained in trees that are rooted for some l £ {l,..,c} at 
some left siblings of vi. Therefore Off c +i will contain all these jobs including 
all left siblings of v c +i, since the limit t won’t be reached. Therefore u c +i will 
eventually be selected to be included in Off c+ i(f) but since it is not scheduled 
by Off it will be the crossing job that closes Off c _|_i(t). 

Finally, if b = m then by the same argument as above it follows that Offm+i 
contains all the children of v m and none of these jobs is a crossing job, therefore 
they are all included in the final solution provided by Off. 

The following corollary immediately follows from Theorems |T] and 0 by re- 
versing the schedule given by R. 
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Corollary 1. There exists a 24~ competitive algorithm for DSP. 

4 Conclusions 

In this paper we considered several aspects of the problem of minimizing the 
weighted flow time of a set of jobs on identical machines. 

We interpret the results in this paper as suggesting that a reasonably simple 
1-speed 0(l)-competitive algorithm for the problem of minimizing the average 
broadcast time of a set of requests is unlikely, since minimizing the weighted 
flow time is a particular case of broadcast scheduling to minimize the average 
flow time. 
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Abstract. The Postman Problem is defined in the following manner: 
Given a connected graph G and a start point s £ U(G), we call a path 
J in G a postman tour if J starts and ends at s and contains every 
edge in E(G). We want to find a shortest postman tour in G. Thus, 
we minimize the number | J\ — |J?(G)|, called the penalty of J for G. In 
this paper, we consider two online versions of The Postman Problem, 
in which the postman is initially placed at a start point and allowed 
to move on edges. The postman is given information on the incident 
edges and/or the adjacent vertices of the current vertex v. The first 
model is called The Labyrinth Model, where the postman only knows 
the existence of the incident edges from v. Thus, he/she does not know 
the other endpoint w of an edge from v, even if it is already visited. 
The second one is The Corridor Model, where the postman can ‘see’ 
the other endpoint w from v. This means that if w is already visited, 
he/she knows the existence of w before traversing the edge (v,w). We 
devised an algorithm for The Corridor Model whose penalty is bounded 
by 2|U(G)| — 2. It performs better than the algorithm described in 0 , in 
the case when the other visited endpoint of an edge from a current vertex 
is known to the postman. In addition, we obtained lower bounds of a 
penalty for The Corridor and Labyrinth Models. They are (4|U(G)| — 5)/3 
and (5| V (G) | — 3)/4, respectively. Our online algorithm uses a linear time 
algorithm for the offline Postman Problem with a penalty at most | V| — 1. 
Thus, the minimum penalty does not exceed | V| — 1 for every connected 
graph G. This is the first known upper bound of a penalty of an offline 
postman tour, and matches the obvious lower bound exactly. 



1 Introduction 

The Postman Problem is defined in the following manner: Given a connected 
graph G and a start point s £ V(G), we call a path J in G a postman tour 
if J starts and ends at s and contains every edge in E{G). We want to find a 
shortest postman tour in G. Thus, we minimize the number | J| — |Ff(G)|, called 
the penalty of J for G. 

In this paper, we consider two online versions of The Postman Problem. In 
an online Postman Problem, the postman is initially placed at a start point and 
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allowed to move on edges. The postman is given information on the incident 
edges and/or the adjacent vertices of the vertex v, at which he/she currently 
locates. 

There are several models worth considering that differ in the information 
given to the postman at v. We consider the following two of them. 

1. The Labyrinth Model 

The postman only knows the existence of the incident edges from v. Thus, he/she 
does not know the other endpoint w of an edge from v, even if it is already visited. 

2. The Corridor Model 

The postman can ‘see’ w from v. It means that if w is already visited, he/she 
knows the existence of w before traversing the edge ( v,w ). 

The offline Postman Problem is known to be polynomially computable ma- 
in 0, an algorithm for The Labyrinth Model based on the depth first search 
scheme (dfs) is given, so that a generated penalty is bounded by 3|F(G)|. Note 
that an online algorithm with an O (|P(G)|) penalty is not obviously obtained, 
while 2|I?(G)| penalty can be achieved by the dfs algorithm. [8] also mentions 
that any simple heuristics using a dfs or greedy scheme cannot guarantee an 
0(|P(G)|) penalty. 

Typical applications of the online Postman Problem are the exploration and 
navigation problems for a robot in an unknown environment. Most of the known 
results 1112J3|4|5161I|,9,| make use of geometric restrictions of the environments, 
not viewing the problems graph-theoretically on arbitrary undirected graphs. 

We will present an algorithm for the Corridor Model with a penalty at most 
2|P(G)| — 2. Thus, we will achieve a better performance than |8], in the case 
when the other visited endpoint of an edge is known to the postman. In addi- 
tion, we will show the lower bounds (4|P(G)| — 5)/3 and (5|P(G)| — 3)/4 of a 
penalty generated by an algorithm for The Corridor and The Labyrinth Models, 
respectively. 

Our algorithm for The Corridor Model uses a linear time algorithm for the 
offline Postman Problem with a penalty at most |V(G)| — 1. Although we already 
have an exact polynomial time algorithm for the offline problem, nothing is 
known about a penalty upper bound for an arbitrary connected graph. Our 
algorithm is faster than the optimal algorithm that runs in super linear time 
using maximum matching. It should be noted that the upper bound |P(G)| — 1 
equals the obvious lower bound, since any postman tour on a tree has size at 
least 2\E(G)\. 

The rest of this paper consists of four sections; Section 2 defines general 
terminology. Section 3 describes our algorithm for the Corridor Model. The lower 
bounds and conclusions are provided in Sections 4 and 5, respectively. 



2 General Definitions 

When we consider a graph G = (V, E ) that the postman is exploring, let H 
denote the subgraph of G consisting of the visited vertices and the traversed 
edges. A vertex v € V is called saturated if every edge incident to v is in H. 
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If the postman traverses an edge for the second or more time, it is a penalty 
traversal. 

3 An Algorithm for the Corridor Model 
with a Penalty at Most 2\V\ — 2 

In this section, we show an algorithm for the Corridor Model with a penalty not 
exceeding 2\V\ — 2. First, we need the following theorem. 

Theorem 3.1 The minimum penalty of a postman tour for a graph G is at most 

|P(G)|-1. 

Proof. We claim that the following algorithm using Divide and Conquer com- 
putes a postman tour with a penalty at most \V\ — 1 for an undirected graph. 

Algorithm OfflinePostman(G, s) 

Input: A connected graph G = (V,E) and a start point s G P(S). 

Output: A postman tour J of G with a penalty |Vj — 1 or less. 

Find a cycle C = (ci, C 2 , . . . , c{\ in G. If G is a tree, the dfs algorithm gen- 
erates J . Remove the edges in C from G, which leaves connected components 
Si, S 2 , , Sk ■ Without loss of generality, assume c* G V(Si) for each i G (1, 2, 

. . . , k} where k < j , and s G V (Si). 

In the divide phase, we recursively compute Ji = OfflinePostman(Si, s) and 
Jj =OfflinePostman(Si , cf) for i G (2, . . . , k}. In the conquer phase to construct 
J, we attach C and Ji to Ji. More precisely, we modify J\ so that when the 
postman reaches ci, lre/slre enters C to visit C 2 and moves on J 2 . When the 
postman comes back to C 2 , he/she continues to move on C to visit C 3 , . . . , c/ 
similarly. The postman continues the rest of the traversal on Ji at ci . □ 

We prove by induction on \V\ that the penalty of J is at most |Vj — 1. The 
basis is a case when G is a tree. Assume true for every connected graph with the 
number of vertices less than |Vj, prove true for |Vj. 

Every edge in G belongs to either C or some S';. The edges in C generate no 
penalty traversal. Thus, every edge with a penalty traversal must belongs to Si, 
which implies 

P(J) = P(Ji) + P(J 2 ) + ... + P(J k ), 

where P is a function that returns the penalty of a postman tour. By induction 
hypothesis and the fact that J,; are vertex disjoint, 

k k 

p(j ) = E p w ^ E - !) < m - L 

i= 1 i=l 

This proves the induction step. □ 

Algorithm OjflinePostman runs in linear time. For, we use the dfs algorithm. 
When we find a cycle C by detecting a back edge, we delete the edges in G. 
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Start the dfs algorithm at each vertex in G, marking an edge by C and the 
direction of its first time traversal. When the traversal of G is complete, recover 
the recursive structure of the obtained postman tours by the mark. Construct a 
desired postman tour for G in linear time. 

The following theorem is our main claim. 

Theorem 3.2 There exists an online algorithm for computing The Corridor 
Model of The Postman Problem that generates a penalty at most 2\V (G)| — 2 for 
any connected graph G. 

Proof. The following is our algorithm. We think that one step of the algorithm 
is completed when an edge is traversed by the postman. 

Algorithm OnlinePostman 

Input: The postman initially locates at a vertex s in a connected graph G = 
(V,E). He/she will explore the entire graph to return to s. At a current vertex 
v, every edge e from v and the other visited endpoint of e are known to the 
postman. The explored subgraph H is memorized by the postman. 

Output: A postman tour of G with a penalty at most 2|P| — 2. 

Run the dfs algorithm. We color edges and vertices in order to imply some 
traversal status. If we color an edge or a vertex in red, it is complete in the dfs 
part of the exploration. 

If the postman finds an uncolored edge e from the current vertex v such that 
the other endpoint of e is not in H , he traverses and colors it in white. If he 
finds an uncolored e with the other endpoint in H , he does not traverse it at the 
moment and colors it in blue. 

When every edge from v is colored, let G v be the subgraph of G, which 
contains all the vertices reachable from v using only blue edges, and the blue 
edges. If every w £ V(G v ) — {u} is colored in red, the postman explores G v by 
OfflinePostman(G v ,v). 

There exists a simple white path P v that connects s and v. The postman 
colors v in red and moves back on the edge incident to v in P v . He/she colors 
the edge in red. 

Continue the above until the postman returns to s. □ 

Lemma 3.3 At any step of the algorithm, the collection of the red and white 
edges forms a tree that spans over V(H). Moreover, the set of the white edges 
forms a simple path between s and v. 

Proof. The edges form a connected component that spans over V(H), since 
they are included in the trajectory of the walk so far. They do not contain a 
cycle, since if the other endpoint of an edge from v is already visited, the edge 
is colored in blue. The second statement is verified by induction on the number 
of steps executed. □ 

Lemma 3.4 Algorithm OnlinePostman finishes when every edge in G is tra- 
versed and the postman returns to s. 
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Proof. Suppose that it does not finish. Let G' be the subgraph of G induced 
by all the vertices visited by the postman in infinitely many steps or until the 
algorithm halts unexpectedly. The collection of the red edges forms a tree that 
spans over V(G'), because G' contains no white edges. 

Since G is connected, there exists an unsaturated vertex v in G' . Let e be 
an untraversed edge incident to v. It is a blue edge so that the other endpoint 
belongs to G' . For, e must be colored due to v £ V{G'). It is colored in blue, 
else it is a red edge that is a witness of e being traversed. 

The blue edges in G' form connected components. One of them C contains 
e. Every vertex in C is incident to both a blue edge and a red edge. There exists 
a step when the last white edge in C turns red. Immediately, OfflinePostman 
would have been issued for the postman to traverse e. Therefore, such e cannot 
exist. 

Since there is no white edge in G' = G, the postman must return to s. This 
correctly finishes the algorithm. □ 

Lemma 3.5 The generated penalty is bounded by 2\V\ — 2. 

Proof. The number of penalty traversals on the red edges is at at most |Vj — 1 
due to Lemma 13.31 We show that the the number of penalty traversals on the 
blue edges is at most \V\ — 1. Let G Vl ,G V2 , . . . , G Vk be the collection of G v for 
which OfflinePostman is issued. It suffices to show that they are vertex disjoint 
subgraphs of G. 

Each G Vi is a component connected by blue edges. Suppose there exists a 
vertex 

w £ V ( G Vl ) n V ( G V2 ) . 

When OfftinePostman(G Vl ,vi) is issued before OfftinePostman(G V2 ,V 2 ), w must 
be colored in red if w ^ v\. Whether w = v\ or not, the blue edges in G V2 incident 
to w have been already colored at the step. Then, OfflinePostman at vi would 
not have been issued for G Vl . Therefore, such w is impossible to exist. □ 

The above sequence of lemmas proves the theorem. □ 

4 Lowerbounds for the Corridor Model 
and the Labyrinth Model 

In this section, we prove lower bounds of a penalty generated by any algorithms 
for the both models. 

Theorem 4.1 No algorithm for The Corridor Model of The Postman Problem 
guarantees a penalty less than j or an arbitrary connected graph G. 

Proof. We construct a graph Gj for a given positive integer j, by the following 
procedure. Call the next graph a unit. 

s = ({p(s), 6(s), ci(s), c 2 (s), c 3 (s)}, (0(s), ci(s)), (ci(s), b{s)), (b(s), c 2 (s)), 

(c 2 (s),p(s)),(ci(s),c 3 (s))}) 
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where p(s), b(s) and Cj(s) are vertices. There are j units in total, say si, S 2 > • • • > Sj. 
Si for each i £ {2,3, ... ,j — 1} is to be connected to the units Sj_i and Sj+i 
at p(si) and b(si), respectively. p(si) is identified with 6 (sj_i), and 6 (sj) with 
p(si+i). b(sj) is connected to a vertex w by an extra edge, and p(s i) is the start 
point of the exploration. 

Lemma 4.2 Any algorithm generates at least five penalty traversals in each s. 

Proof. The postman always enters each s at p(s) for the first time. Due to the 
definition of The Corridor Model, it is possible that he/she reaches C 2 (s) without 
saturating ci(s) and b(s). The postman must re-visit the two vertices. 

If the postman traverses the edge (c 2 (s),p(s)) at the next step, it takes at 
least five penalty traversal to visit b(s), Ci(s) and C3(s) to come back to p(s) 
once again. If the postman traverses (c 2 (s), 6 (s)) at the next step, he/she needs 
to visit every Cj(s). This takes at least five penalty traversals also. □ 

\V(Gj)\ and \E(Gj)\ are 4 j + 2, and 5 j + 1, respectively. The number of penalty 
traversals is at least \E(Gj)\ due to the lemma. Thus, we have a penalty for Gj 
at least 



5 j + 1 
4 j + 2 



• I V(Gj)\ 



|(4j+2)- 
4 j + 2 



l\V(G,)\- 



3 

2 



3 

2 



•im)l 



where \V(Gj)\ is arbitrary large. □ 

The lower bound 4 ^~ 5 for The Labyrinth Model is obtained, by a similar 
way that changes a unit s into a ‘square’, i.e., obtained by eliminating C 3 (s) from 
s. The only difference is that in The Labyrinth Model, the postman possibly 
explores every vertex in s at the first time visit to come back to p(s) without 
saturating b(s). 

If a multiple edge is allowed for The Labyrinth Model, we can obtain a greater 
lower bound 2\V\ — 1. It is doable by further shrinking every s into a cycle of 
length two. In other words, if one comes up with an algorithm for The Labyrinth 
Model with a penalty less than 2\V\ — 1, it should use the fact that the given 
graph is free from multiple edges. 



5 Conclusions 

A 2|V| — 2 penalty algorithm for The Corridor Model has been presented, and 
some penalty lower bounds for the both models are obtained. In addition, the 
tightest upper bound of a penalty for the offline problem is shown. The result is 
summarized in Table [B 

This also poses an open problem asking how close the both bounds can 
be in the two models. We doubt the existence of a 2\V\ penalty algorithm for 
The Labyrinth Model, even in multiple edge free graph. What prevents us from 
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Table 1. Summarizing the Obtained Bounds 





Offline 


The Labyrinth Model 


The Corridor Model 


Upper Bound 


\v\-i 


3| V| * 


2\V\ -2 


Lower Bound 


\v\-i 







* Due to [8j ** 2\V\ — 1 if multiple edges are allowed, 



matching the both bounds seems similar in the both models. We think that it 
is a kind of ‘algorithm dependent bottlenecks’, meaning that there are distinct 
types of lower bound graphs for distinct exploration strategies. Therefore, we 
feel that if one model is improved, so is the other. 
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Abstract. We present a simple dual ascent method for the multilevel 
facility location problem which finds a solution within 6 times the op- 
timum for the uncapacitated case and within 12 times the optimum for 
the capacitated one. The algorithm is deterministic and based on the 
primal-dual technique. 



1 Introduction 

An important problem in facility location is to select a set of facilities, such as 
warehouses or plants, in order to minimize the total cost of opening facilities 
and of satisfying the demands for some commodity (see Cornuejols, Nemhauser 
& Wolsey |GNW 90 1 1 . 

In this paper we consider the multilevel facility location problem in which 
there are k types of facilities to be built: one type of depots and (k — 1) types 
of transit stations. For every type of facility the opening cost is given. Each 
unit of demand must be shipped from a depot through transit stations of type 
k— 1, . . . , 1 to the demand points. We assume that the shipping costs are positive, 
symmetric and satisfy the triangle inequality. The goal of the problem is to select 
facilities of each type to be opened and to connect each demand point to a path 
along open facilities such that the total cost of opening facilities and of shipping 
all the required demand from depots to demand points is minimized. 

Being an extension of the uncapacitated facility location problem, which is 
known to be Max SNP-hard (see |OK98j and [S97j ). this problem is Max SNP- 
hard as well. The first approximation algorithms for the multilevel facility loca- 
tion problem were developed by Shmoys, Tardos & Aardal [ STA97] and Aardal, 
Chudak & Shmoys , ACS99j and were based on rounding of an LP solution to 
an integer one. The performance guarantees of these algorithms were 3.16, re- 
spectively 3. The first combinatorial algorithm for the multilevel facility location 
problem was developed by Meyerson, Munagala & Plotkin |MMP00] . and finds 
a solution within O (log |Z?|) the optimum, where D is the set of demand points. 

Using an idea from jJV99| . we present a simple greedy (dual ascent) method 
for the multilevel facility location problem that finds a solution within 6 times 
the optimum. The algorithm extends to a capacitated variant of the problem, 
when each facility can serve only a certain number of demand points, with an 
increase of the performance guarantee to 12. 
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2 The Metric Multilevel Uncapacitated Facility 
Location Problem 



Consider a complete (k + 1 )— partite graph G = ( V. , E) with F = VoU. . .UFfc and 
k 

E = |J F- i x F- The set D = F 0 is the set of demand nodes and F = Fill. . .UF/ C 

i=i 

is the set of possible facility locations (at level 1, . . . ,k). We are given edge costs 
c £ Rf and opening costs f £ Rf ( i.e., opening a facility at i £ F incurs a 
cost fi > 0). We assume that c is induced by a metric on V. Without loss of 
generality we can assume that there are no edges of cost 0. 

Remark 1. Our results also hold in a slightly more general setting, where we 
require only for e £ Vo x Fi that c(e) < c(p) for any path p joining the endpoints 
of e. 

Denote by P the set of paths of length k — 1 joining some node in Fi to 
some node in Fc. If j £ D and p = (tq, . . . , Vk ) £ P, we let jp denote the path 
(j,v i,... ,Ufc). As usual c{p) and c(jp) denote the length of p resp. jp (with 
respect to c). 

The corresponding facility location problem can now be stated as follows: 
Determine for each j £ D a path pj £ P (along ’’open facilities”) so as to 
minimize 



c fe) + /( U u)- 

j£D j£D 



Remark 2. In this setting we assume that each j £ D has a demand of one unit 
to be shipped along pj. Our results easily extend to arbitrary positive demands. 

To derive an integer programming formulation of the multilevel facility lo- 
cation problem, we introduce the 0 — 1 variables yi ( i £ F) to indicate whether 
i £ F is open and the 0 — 1 variables Xj P ( j £ D, p £ P) to indicate whether j 
is served along p. 

We let 



=(*) := EE 



p£P jGD 



f(y) : = f iVi ■ 

i£F 



and 
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The multilevel facility location problem is now equivalent to 





minimize c(x) + f(y) 








subject to Xj P = 1, 


for each j £ D 


(1) 




p€P 






(Pint) 


^ ] Xjp ^ Vii 


for each i £ F, j £ D 


(2) 




p3i 








X PJ ^ {^7 1} J 


for each p £ P, j £ D 






Vi G {0, 1} , 


for each i £ F 




Constraints (1) 


ensure that each j gets connected via some path and 


con- 



straints (2) ensure that the paths only use open facilities. 

The LP — relaxation of (Pint) is given by 

minimize c( x) + f(y) 

(P) subject to (1), (2) 

Xj p > 0 

Vi > 0 ■ 

Note that Xj P < 1 is implied by (1) and yt < 1 holds automatically for any 
optimal solution (x,y) of (P). 

The standard way of proving a 0 — 1 solution (xpy) of (Pint) to be a p— 
approximation is to show that 

c(x) + f(y) < pC LP (2.1) 

where Cpp is the optimum value of (P). 

3 The Primal-Dual Algorithm 

The basic idea of the primal-dual approach is to exhibit a primal 0 — 1 solution 
(x,y) satisfying (2.1) by considering the dual of (P). Introducing dual variables 
Vj and tij corresponding to constraints (1) and (2) in (P), the dual becomes 

maximize Vj 

jpD 



Vj — ^ tij < c(jp ) , for each p £ P, j £ D 


(3) 


i€p 




t^ < fi, for each i £ F 


(4) 


jpD 




tij >0, for each i £ F, j £ D 





Intuitively, the dual variable Vj indicates how much j £ D is willing to pay 
for getting connected. The value of tij indicates how much j £ D is willing to 
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contribute to the opening cost fi (if he would be connected along a path through 

i). 



We aim at constructing a primal feasible 0 — 1 solution (x, y ) and a feasible 
dual solution ( v , t) such that 

c(x)+f(y) <6 ^vj, 

jGD 

implying (2.1) for p = 6. 

We first describe how to construct the dual solution (v,t). To this end, we 
introduce the following notation w.r.t. an arbitrary feasible solution (v, t) of ( D ): 

A facility i £ F is fully paid when 

E = /'• ('•’>•') 

jeD 

A demand point j £ D reaches ii £ V) if for some path p = (i ls . . . ,i{) from 
Id to ii all facilities i\, . . . ii~\ are fully paid and 

v j = c jp (^-2) 

i€p 

If, in addition, also ii is fully paid, we say that j leaves ii or, in case l = k, 
that j gets connected (along p to ik £ 14). 

Our algorithm for constructing the dual solution is a dual ascent method, 
generalizing the approach in [,TV99| . We start with v = t = 0 and increase all 
Vj uniformly ( ’’with unit speed” ). When some j £ D reaches a not fully paid 
node i £ F, we start increasing tij with unit speed, until /, is fully paid and j 
leaves i. We stop increasing Vj when j gets connected. The algorithm maintains 
the invariant that at time T the dual variables Vj that are still being raised are 
all equal to T. More precisely, we proceed as described below. 

UNTIL all j £ D are connected DO 

• Increase Vj for all j £ D not yet connected 

• Increase tij for all i £ F , j £ D satisfying (i) — (Hi), 

(■ i ) j has reached i 

(ii) j is not yet connected 

(Hi) i is not yet fully paid. 

Let (v, t) denote the final dual solution. Before constructing a corresponding 
primal solution (x,y), let us state a few simple facts about (v,t). 

For each fully paid facility i £ VJ, l > 2, denote by T) the time when facility 
i became fully paid. The predecessor of i will be the facility in the level l — 1 via 
which i was for the first time reached by a demand point, i.e., 

( , 1 

Pred (i) = < % £ V)_i| i! is fully paid and T. f + c^i = min (Tin + Ci"f) > . 

1 i"eVi-i 

| // 

V i fully paid 
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(Ties are broken arbitrarily.) 

The predecessor of a fully paid facility i £ V\ will be its closest demand 
point. We can define the time T Pred ^ = 0. 

For all fully paid facilities i in the k — th level denote by ji Pi = (ii, . . ■ , ik ) 
the path through the following points: 

• ik = i 

• ii =Pred(ii + i), for each 1 < l < k — 1 
•ji =Pred(i 1 ). 

We will call the neighborhood of i the set of demand nodes contributing to pi 



Ni = {j £ D | t P j > 0 for some i! £ pi} . 

Since each j £ D gets connected we may fix for each j £ D a connecting 
path pj £ P of fully paid facilities (ties are broken arbitrarily) . 

Lemma 1. (i) c(jPj) < Vj for all j £ D 

(ii) For all j £ D and i £ Vk fully paid such that i £ pj, either Vj = Ti and 
Uj > 0 or Vj > Ti and tij = 0 

(in) For all fully paid facilities i 6 Vk and corresponding paths Pi = (ii, • ■ ■ , 
ik), the following relation holds 



Ti i < • • • < Ti k 

(iv) Let i £ Vk be a fully paid facility and Pi = (iq . . . , ik) its associated path. 
For all j £ D and ii £ pi with t^j > 0, there exists a path p from Vito ii such 
that 

k - 1 

c Up) + c bb+i - Ti ■ 

S—l 

In particular, c(jiPi) < I) 

(v) If i,i' are two fully paid facilities in Vk with intersecting neighborhoods 
then for each j' £ D, such that i' £ pf' , Cj i ji < 4 max {Tj : , Vj> } 

(vi) tpj < v j f or oil j G D 

i'€pi 

Proof. The first claim is straightforward from (3.2) and the definition of pj. 

The second claim is based on the observation that at time T all the v — values 
that can be increased are equal with T and that the final v — values reflect the 
times when the demand points get connected. There are two possibilities that a 
fully paid facility i £ Vk is on the connecting path of a demand point j. One is 
that j reached i before Ti and got connected when i became fully paid. In this 
case Uj > 0 and Vj = Tj. The other possibility is that j reached i after i was 
fully paid, which means that t l3 = 0 and Vj > Ti. 

The definition of a predecessor implies that for each fully paid i £ F 



CPred(i)i T Ppred,(i) — Tj . 



( 3 . 3 ) 
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The third claim follows immediately. 

For the forth claim, by adding the inequalities (3.3) for q +1 ,... ,ik-i one 
obtains 

k - 1 

'y 1 C i 3 is+1 "b ^ 2) fe . 

S = l 

Since j > 0, there is a path p along which j reached i; before 7 ), . Clearly, 
c(jp) < 7) ( , which implies (iv) . 

For proving (v), let j G A) (~l A)/. Since j G A), there is an ii G pi such that 
t H j > 0. Then by (iv), there exists a path q from V\ to ii such that c(jq) < Ti. 

Suppose pi ' = (i'i, ■ ■ ■ ,i' k ). Similarly, there is an i! r G p,/ and a path q' from 

fc-i 

V\ to i' r such that c(jq') + Cj/y < Ty. 

s=r 

Using the triangle inequality and (ii), we obtain 

k - 1 

cpy < c tiiPi) + c U<l) + c{jq') + c i' s i's + i + 

s=r 

< 27) + Tp + Vj> 

< 27) + 2 Vj> 

< 4 max {7) , } . 

Finally, for proving the statement in the last claim is enough to show that no 
demand point j could increase simultaneously two values U t j,ti s j, for ii ^ i s and 
ii,i s & Pi- This follows from the definition of p*, which implies that whenever a 
demand point reaches a facility on p, ; , the predecessor of that facility should have 
been already paid, and subsequently all the facilities of p, situated on inferior 
levels. □ 

We now describe how to construct a corresponding primal solution ( x,y ). 
Suppose there are r fully paid facilities in the last level. Order them according 
to nondecreasing T— values, say 

Ti < ... <T r . 

Construct greedily a set C C V) of centers which have parewise disjoint 
neighborhoods and assign each j G 77 to some center io G C as follows: 

INITIALIZE C = 0 
FOR * = 1, . . . , r DO 

IF Ni 0 N io ^ 0 for some i 0 < i, assign to p io all demand nodes 
j G D with i G Pj 

ELSE C = CU{i} and assign to p, all the demand nodes j G D 
with the property that i G Pj 



The paths p,; (i G C) are called central paths. 
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Remark 3. Note that each demand point j is assigned to one center. Further- 
more, by construction of C, j ’’contributes” to at most one central path (not 
necessarily the one to which it is assigned). 



The primal solution (x, y) is obtained by connecting all demand nodes along 
their corresponding central paths: 



x j P 



1 if p = pi and j was assigned to Pi 
0 otherwise 



and 



{ 1 if i is on a central path 
0 otherwise 

The shipping cost c (x) is easily bounded as follows. 

If j € D is assigned to Pi 0 then T io < 7), where {*} = Pj fl V), . Due to Lemma 
1 (ii) and (v), we get T, 0 < Vj and 



c iPi 0 



< C A, 



V 3i 0 Pi 0 



< 4 Vj + T io < 5 Vj . 



The cost of opening facilities along a central path p io can be also bounded 
with the help of Lemma l(vi) 



*e Pi 0 i£Pi 0 j£Ni jeNi 

Since the centers have pairwise disjoint neighborhoods, we further conclude 
that 



f(y) = H Y. h-Y. v i ■ 

i 0 eCiepi 0 jGD 



We have proved 

Theorem 1. The above primal solution ( x,y ) satisfies 

c{x) + f(y) <6^2^ . 

j£D 



4 A Capacitated Version 

The following capacitated version has been considered in the literature: Each 
i € F has an associated node capacity Ui € N which is an upper bound on the 
number of paths using i. On the other hand, we are allowed to open as many 
copies of i (at cost /,; each) as needed. 
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To formulate this, we replace the 0 — 1 variables j/i in (Pint) by nonnega- 
tive integer variables ?/,; £ Z + , indicating the number of open copies of i £ F. 
Furthermore, we add capacity constraints 

EE Xj P < Uiyi , for each i £ F . (4-1) 

j£D pBi 



Again, we let C-lp denote the optimum value of the corresponding LP-relaxation. 

The idea to approach the capacitated case (also implicit in [ JV99] for the 1- 
level case) is to move the capacity constraints to the objective using Lagrangian 
multipliers \ > 0, for each i £ F. This results in an uncapacitated problem 



C( A) := minimize c( x) + f(y) + EM EE X jp UiPi I 

i£F \j£D pBi ) 

= minimize c( x) + f(y) 

with fi = fi — Xi'Ui , for each i £ F and c(e) = c(e) + Xi if i is the endpoint of 
e £ E. Note that each A > 0 gives C(X) < Clp- 

As in section 3. we compute a primal 0 — 1 solution (x, y) of C( X) with 

c(x) + J(y) < 6C(A) . 

Note that this does not necessarily satisfy the capacity constraints (4.1). 
However, a clever choice of the Lagrangian multipliers A i = ^— (i £ F) yields 

c(x) + f(y) = c(x) + ^E^EE% + ^E /»2/» 

i&F 1 p3i jGD i£F 

> c(x) + , 

i£F 



where yi := ~ X! S x jp opens each facility i £ F sufficiently many times. 

pBi j£D 

Hence (x,y) is indeed a feasible solution of the capacitated problem satisfying 
c(x) + \f(y) < 6C(A) < 6 C LP , 



hence 



c(x) + f(y) < 12 C LP . 



Theorem 2. Our greedy dual ascent method yields a 12 — approximation of the 
multilevel capacitated facility location problem. 



A Simple Dual Ascent Algorithm 



63 



References 

ACS99. Aardal,K.I., Chudak, F., Shmoys, D.B.: A 3-approximation algorithm for 
the k-level uncapacitated facility location problem. Information Processing 
Letters, 72 , (1999), 161-167 

CNW90. Cornuejols, G., Nemhauser, G. L., Wolsey, L. A.: The uncapacitated facil- 
ity location problem. In P. Mirchandani and R. Francis, editors, Discrete 
Location Theory, John Wiley and Sons, New York, (1990), 119-171 

GK98. Guha, S., Khullcr,S.: Greedy strikes back: improved facility algorithms. Pro- 
ceedings of the 9th Annual ACM-SIAM Symposium on Discrete Algorithms, 
(1998), 649-657 

JV99. Jain, K., Vazirani,V.V.: Primal-dual approximation algorithms for metric 
facility location and k— median problems. Proceedings of the 40th Annual 
IEEE Symposium on Foundations of Computer Science, (1999) 

MMP00. Meyerson, A., Munagala, K., Plotkin S.: Cost distance: Two metric network 
design. Proceedings of the 41th IEEE Symposium on Foundation of Com- 
puter Science, (2000) 

STA97. Shmoys, D. B., Tardos, E. Aardal, K. I.: Approximation algorithms for facil- 
ity location problems. Proceedings of the 29tli Annual ACM Symposium on 
Theory of Computing, (1997), 265-274 

S97. Sviridenko, M.: Personal communication, (1997) 




Approximation Schemes 
for Ordered Vector Packing Problems 



Alberto Caprara 1 ’*, Hans Kellerer 2 , and Ulrich Pferschy 2 



1 DEIS, University of Bologna, viale Risorgimento 2, 1-40136 Bologna, Italy 
acapraraOdeis . unibo . it 

2 University of Graz, Department of Statistics and Operations Research, 
Universitatsstr. 15, A-8010 Graz, Austria 
{hans .kellerer ,pf erschy}@uni-graz . at 



Abstract. In this paper we deal with the d-dimensional vector packing 
problem, which is a generalization of the classical bin packing problem in 
which each item has d distinct weights and each bin has d corresponding 
capacities. We address the case in which the vectors of weights associ- 
ated with the items are totally ordered, i.e. , given any two weight vectors 
di, dj, either at is componentwise not smaller than a,j or a,j is componen- 
twise not smaller than a;, and construct an asymptotic polynomial-time 
approximation scheme for this case. As a corollary, we also obtain such 
a scheme for the bin packing problem with cardinality constraint, whose 
existence was an open question to the best of our knowledge. 

We also extend the result to instances with constant Dilworth number, 
i.e. instances where the set of items can be partitioned into a constant 
number of totally ordered subsets. We use ideas from classical and re- 
cent approximation schemes for related problems, as well as a nontrivial 
procedure to round an LP solution associated with the packing of the 
small items. 



1 Introduction 

In the classical one dimensional bin packing problem (BP) we are given n items, 
the *-th item having a weight at € (0, 1] for i = 1 , ... ,n, and an infinite number 
of unit capacity bins. The goal is to assign each item to a bin so that the sum 
of the weights in each bin does not exceed the capacity and the total number 
of nonempty bins is minimized. The d-dimensional vector packing problem (VP) 
is the generalization of BP in which every item i instead of just one weight has 
d weights aj,af, ... ,af and every bin has d corresponding capacities all equal 
to 1 (possibly after scaling). For notational convenience we simply use aj which 
should not be confused with the j-th power of a*. We let m := (a*, a 2 , . . . , af) 
denote the weight vector associated with item i. This generalization was first 
introduced in [4| and then studied by various authors, being both of practical 
and theoretical interest. 

* The work of the first author was partially supported by CNR and MURST, Italy. 



M. Goemans et al. (Eds.): APPROX- RANDOM 2001, LNCS 2129, pp. 63 4751 2001. 
(c) Springer- Verlag Berlin Heidelberg 2001 
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A relevant special case of VP arises when d = 2 and the item weights on the 
second dimension are identical, say of = 1/fc for all i. In this case, we have a BP 
with an upper bound fc on the number of items which can be put together into 
one bin, called the k-item bin packing problem (fcBP). This problem was first 
treated in 1975 by Krause, Shen and Schwetman [7]. In their paper fcBP appears 
as a formulation of a task-scheduling problem on a multiprogramming computer 
system, which is still an important part of every operating system for multiple 
processor machines 25 years later. Further details are given in JS] and .7]. 

Also the general 2-dimensional vector packing problem has many practical 
applications, some of which are mentioned in 0 , m and pQ, where heuristics 
and exact algorithms are proposed. 

An illustrative example of an important application of 2-dimensional vector 
packing arises in the logistics of cargo airplanes. In a real-world problem from a 
major cargo airline we are given a set of n transportation requests for a certain 
very busy flight route, e.g. Frankfurt - New York. Each such request basically 
consists of a single freight piece or package. The obvious objective of the freight 
disponent is to find an assignment of the packages to planes such that all packages 
can be transported on a minimal number of planes. Assuming that all available 
planes have the same capacity is quite common, since many airlines try to use 
homogenuos fleets for certain routes. Moreover, each plane usually makes the 
round-trip several times such that a demand of e.g. 8 planes may be realized by 
using only 2 planes each making the trip four times. 

For each package i we are given two parameters, its weight aj and its volume 
a\. Although an optimal packing of the shapes of the items to be transported 
would be a three-dimensional packing problem, in practice the actual loading 
of the items assigned to each plane is done by experienced workers without 
mathematical optimization of the loading pattern. For practical purposes it is 
sufficient to ensure that the volume of each (usually rectangular) package does 
not exceed a given upper bound, which is naturally smaller than the actual 
volume available on the plane. The upper bound on the total weight in each 
plane is given by its pay load. Clearly, the resulting optimization problem is a 
special case of VP with d = 2. In practice, packages from the same customer 
share the main characteristics of weight and volume, in the sense that a heavier 
package also requires more volume. 



We will first consider the so-called ordered VP. To this end, we define a 
natural order relation on the vectors in R rf and write u ^ v for any two vectors 
u, v in the same space if u is at least as large as v in every component. The 
ordered VP is the special case of VP in which, for any two items i,j either 
ai aj or aj >z a,i holds, i.e. the relation “V” defines a total ordering on the 
set of items. Note that fcBP is a special example of the ordered 2-dimensional 
VP. Furthermore, we can introduce a cardinality constraint for VP analogous to 
fcBP by requiring that at most k items may be packed into every bin. This can 
be modeled by extending VP to a (d + l)-dimensional vector packing problem 
where af +1 = 1/k for all i. Clearly, an ordered VP is still ordered also after 
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adding this additional dimension. Hence, all results derived in this paper for the 
ordered VP also hold for the corresponding cardinality constrained problem. 

Clearly, the item set N of every general VP can be partitioned into subsets 
N 1 , . . . , N c such that, for every subset IV- 7 and every two items i,h £ IV- 7 , either 
cti >z ah or ah h (it holds, i.e. defines a total ordering on each subset 
N 3 . The Dilworth number of a VP instance is the minimum c for which such 
a partitioning exists. Of course, an ordered VP instance has Dilworth number 
1. The Dilworth number is equal to the “classical” Dilworth number for the 
partially ordered set given by the item weight vectors and the order relation 
“>y” . Hence, the Dilworth number of a VP instance can be computed efficiently 
by max-flow techniques (see e.g. 0). We will also show in this paper how to 
extend our results to VP with constant Dilworth number. The discussion above 
points out that VP instances arising from real-world applications often have a 
relatively small Dilworth number. 

All the problems mentioned above are generalizations of BP and hence NP- 
hard in the strong sense. Moreover, it is well-known that no polynomial time 
approximation algorithm with a worst-case performance ratio better than 3/2 
can exist for BP, unless P=NP. However, Fernandez de la Vega and Lueker [3] 
gave an asymptotic polynomial time approximation scheme (APTAS) for BP. 
Their method can easily be extended to an asymptotic (g? + ^-approximation al- 
gorithm for VP. Recently, Chekuri and Khanna |Z] improved this result proposing 
a polynomial-time algorithm with worst-case performance ratio l+ed+0(ln e _1 ), 
for any e > 0, which leads to a polynomial-time algorithm with worst-case perfor- 
mance ratio O(lnd) when the dimension d is fixed. The existence of an APTAS, 
even for the case d = 2, was recently ruled out by Woeginger m (on the assump- 
tion Py^NP). For fcBP, to the best of our knowledge, the best known asymptotic 
approximation ratio achievable is 3/2, as given in j6|. 

Our main result is showing the existence of an APTAS for the ordered VP 
in Section El As a corollary, we get an APTAS for &BP. Our scheme is based on 
standard techniques such as small items elimination and item grouping (cf. [3]), 
but also on the enumeration of the solutions for the instance obtained from 
grouping and on the use of an LP to include the small items for each solution. 
These latter techniques are similar to those used in [Z] for multiprocessor vector 
scheduling (a problem related to VP in which the number of bins is fixed, all 
bins must have the same capacity b on all dimensions, and one would like to 
minimize the value of b so as to pack all items). Nevertheless, while the rounding 
of the LP solution is trivial for the case of the constant dimension, see [2J, when 
the dimension is a part of the input rounding has to be done carefully to achieve 
an APTAS, as we will show in Section El In Section [3] we will show how our 
approach can be extended to VP with constant Dilworth number. An immediate 
corollary is an APTAS for the special case of a 2-dimensional VP where the 
number of different item weights in one dimension is bounded by a constant. 

In the conclusions of m it is mentioned that “a slight modification of the 
method of Fernandez de la Vega and Lueker [3] yields asymptotic polynomial 
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time approximation schemes for the subproblems of d-dimensional vector packing 
with constant Dilworth number”. This would cover the results presented in this 
paper (at least for a constant dimension - it is not clear whether the above 
sentence refers to a constant dimension or not). Nevertheless, to the best of our 
understanding, in order to achieve our results a substantial modification of the 
method in [3] is necessary, along the lines presented in this paper. In particular, 
even if a near-optimal solution for the large items (see the next section) in 
an ordered VP instance is easy to achieve following the approach in [3] , it is 
not always possible to extend this solution to a near-optimal one for the overall 
instance, including the small items. (Certainly a greedy approach does not work, 
as it may end up with bins almost filled in different dimensions.) In fact, the 
packing of the large items can hardly be done independently of the small items, 
as it is done in [3] and other similar approaches, to achieve near-optimality. 

Consider for example the /cBP instance with k = 5 and n = 2s + 8s items 
(where s is a positive integer), 2 s with weight 1/2 and 8s with weight 1/8. 
Clearly, the optimal solution packs one item of weight 1/2 and four items of 
weight 1/8 per bin, requiring a total of 2s bins. If the small items are those 
with weight 1/8, the optimal packing of the large items, requiring s bins, is 
obtained by packing two items of weight 1/2 in each bin. Then, the remaining 
small items must be packed into separate bins, requiring (//] additional bins. 
If we replace 8 by 2 P for some integer p > 3 and let k = 2 P_1 + 1, assuming 
the small items are those with weight 1/2 P , the optimal solution still needs 2s 
bins, whereas the solution obtained by optimally packing the large items first 



uses s 



2 * ft 

2P-! + l 



bins, which goes to 3s as p goes to infinity. Accordingly, the 

worst-case approximation guarantee cannot be better than 3/2 (even for fcBP) 
if the large items are packed first. 



Although we have already used the notion of approximation algorithm 
throughout the Introduction, we will conclude this section with a precise def- 
inition of approximation algorithms used in the paper. These algorithms are 
generally classified by their worst-case ratio: For any VP algorithm A let C A (I) 
denote the number of bins used by algorithm A, and let C OPT (/) denote the 

minimum (optimum) number of bins required to pack the items of a given in- 

stance I. Then the asymptotic worst-case performance ratio is defined as 

[) C A (I) 

R * = cony) • 

Observe that Ra < Ki if there are two constants Ki and K 2 such that for every 
problem instance I 

C A (I) < KiC opt (I) + K 2 . 

Moreover, Ra is an absolute worst-case ratio if K 2 = 0. Consider a value e £ 
(0, 1). We say that A is an asymptotic (1 + e )- approximation algorithm if Ra < 
(1 + e). An APTAS is an asymptotic (1 + ^-approximation algorithm for any 
e > 0 with running time polynomial in the size of the encoded instance. 
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2 An APTAS for the Ordered VP 

This main section contains the description of an APTAS for the ordered VP 
both for the fixed dimension and for the case where d is a part of the input. Let 
s be the required accuracy. We will assume e < 1/2. As it is usually the case 
for packing problems we will distinguish between small and large items. Let us 
define for the given value e € (0, 1/2) 

L := {i | a/ > e for some k £ (1, . . . , d}}, £ := \L\, large items 

S := {* | a/ < e for all k £ (1, . . . , d}}, s := |S|, small items. 

For simplicity we assume that L = {1, ...,£} with a\ «2 ^ t a^. 



2.1 Enumerating Packings for the Large Items 

Let us first assume that £ > 2/e 2 . Later on we will also illustrate how to handle 
the (simpler) case £ < 2/e 2 . We introduce item grouping to attain a simplified 
structure of packings for the large items, so that only a polynomial number of 
different packings have to be considered. In particular, we group the items in L 
as follows. Define 

h := [i£ 2 \ 

and let p and q be such that £ = ph + q and 1 < q < h. Note that h > 2. We 
now show that 

P<y 2 (!) 
£ z 

and hence p is bounded by a constant. If (JT]) were not true we would have the 
contradiction 



£= ph + q> -2 |_A 2 J > -2 (£s 2 — !)=£+{£ — 



> 



because £ > 2/e 2 . 

We partition L into the p + 1 subsets L, := {ih + 1, ...,(* + l)h}, i = 
0, . . . ,p — 1, each containing h items and L p := {ph + 1, . . . , £}. Note that each 
bin in a feasible solution contains at most [-J items from L , since ag has a weight 
of at least e in at least one dimension and all other items are not smaller in this 
dimension. Accordingly, we let a packing vector t be a (p+ l)-dimensional vector 
with ti < [/J for i = 0, . . . ,p, where ti denotes the number of items from Li. 
We observe that, while all sets of large items that fit in a bin correspond to a 
packing vector, the converse does not hold. 

The total number / of all possible packing vectors t with £j < for all i is 
clearly bounded by 



/< 





p+i 



since each of the p + 1 components of t can be chosen between 0 and [ . Hence, 
/ is constant for any fixed £. 
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In our approximation scheme we compute all possible combinations of pack- 
ing vectors to the bins. To this end, we need to know the number m = C OPT (I) 
of bins used by the optimal solution. This can be done either by trying all the 
possible values of to , between 1 and n, or by performing binary search. For the 
given to, we consider all possible assignments of packing vectors to the to bins. 
To get a very rough upper bound on the number of such assignments, consider 
that each of the / packing vectors can be selected up to to times. Hence, we 
can count for every packing vector the number of times it is assigned to a bin. 
Generating / such numbers between 0 and m yields a total of (m+ 1)^ possible 
assignments which is polynomial in to (and n). Note that we do not have to 
identify the particular bin a vector is assigned to. 

An assignment is called feasible if the sum over the All components of all to 
packing vectors is equal to |L,| for * = 0, 1, . . . ,p. Note that in this way we will 
also consider the optimal assignment, i.e. the assignment that associates with 
each bin exactly the packing vector defined by this bin in the optimal solution. 

We can define for every packing vector t an induced packing of items in L into 
a bin (not necessarily feasible) by packing up to ti arbitrary items from set Ti+i 
for i < p. The entry t p of the packing vector is ignored in the induced packing. 
If the packing vector t corresponds to a feasible packing then also the packing 
induced by t is feasible, since every item k € Li in the feasible packing is replaced 
in the induced packing either by no item or by an arbitrary item j £ Li + \ such 
that aj A afc. Note that a feasible assignment may involve a packing vector 
corresponding to an infeasible bin. However, the crucial point that guarantees 
that our approach is correct is that a packing vector corresponding to a feasible 
bin gives rise to an induced packing which is always feasible. 

For every feasible assignment considered, we generate the corresponding in- 
duced packing for all bins. Recall that this means that for every bin with 
an assigned packing vector t we pack up to t t arbitrary items from L,+i for 
* = 0, . . . ,p— 1. Since the assignment is feasible, it is easy to generate the induced 
packing so that all items in L\, L ^, . . . , L p are packed, noting that \L.f\ = |T,:+i| 
for * = 0, ... ,p — 2 and |T p _i| > \L p \. If in the thereby generated packing the 
capacity of some bin is exceeded, the feasible assignment we started from can- 
not be the optimal one, and we simply discard it. Otherwise, i.e. if all the to 
d-dimensional capacity constraints are fulfilled, let bj = (bj, , l/j) be the resid- 

ual capacity vector of bin j for j = 1, . . . ,m. By construction of the packing 
induced by the packing vectors of the optimal assignment, the values bj of this 
packing are componentwise at least as large as the residual capacities in the 
optimal solution for each bin j. 

The items in Lq are left unpacked and are put each into a separate bin, which 
requires h extra bins. A trivial bound on the total weight of large items yields 
to > \£e\ since, as mentioned before, in at least one dimension item ae and also 
all other large items have a weight at least e. Hence, we get 

h = [fe 2 J < [fe] £ < me. 

Summarizing, since one of all the assignments derived from the considered as- 
signments of packing vectors to bins corresponds to the optimal packing, we 
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managed to pack all large items into m + me bins while leaving in every bin a 
residual capacity for the small items at least as large as in the optimal solution. 

We now consider the case £ < 2/e 2 . In this case, the number of large items is 
constant and, assuming we know the number m of bins in an optimal solution, 
we enumerate all the 0(m e ) feasible packings of the large items into the bins, 
i.e. packings fulfilling the capacity constraints in every dimension, among which 
is the optimal assignment. 

Summarizing, we have proved 

Theorem 1. For any fixed e £ (0,1/2), there is a (1 + e)- approximation algo- 
rithm for the ordered VP if all items are large with respect to e. □ 



2.2 Packing the Small Items 

For each feasible packing of large items generated above in either case, we pack 
the small items. First we solve the following LP. 

For each small item i £ S and bin j = 1 ,m let x ij £ [0, 1] denote the 
(fractional) part of item i which is packed into bin j. 

m 

y Ixjj = 1, i£S, (2) 

3 = 1 

y^a'fXij <b k j: j = 1,.. .,m,k = 1,. . . ,d, (3) 

ieS 

Xij > 0, i £ S, j = 1, . . . ,m. (4) 



Note that we are only interested in a feasible solution to this set of linear in- 
equalities. To formally define an LP any objective function may be added. 

If the LP has no feasible solution, the feasible assignment cannot correspond 
to the optimal assignment by the above discussion, and this assignment is dis- 
carded. Otherwise, a basic feasible solution of the LP contains at most s + dm 
nonzero variables, and a straightforward counting argument (also given in ED 
shows that the number of items that are not packed into a single bin is at most 
dm. These items can be put into separate bins. Recalling that a k < e for each 



i £ S and k = 1 , ... ,d, this requires no more than 



dm 

LIJ 



bins, which can be 



bounded from above with a simple calculation by 2 dme + 1 for e < 1/2. 

Accordingly, the solution found for the correct value of m and the optimal 
assignment has a value bounded by 



m + me + 




< m + (2d + 1 )me + 1, 



which proves 
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Theorem 2. There is an APTAS for the ordered VP if the dimension d is 
constant. □ 

Corollary 1. There is an APTAS for kBP. □ 

2.3 Rounding the LP Solution for the General Dimension 

The case in which d is a part of the input needs a more careful approach. Let x* 
be an optimal solution of the above LP. Our final aim is to round this solution 
in a way such that only 2m items remain unpacked in bins l,...,m. By the 
discussion above, packing these items in additional bins in a greedy way will 
require no more than 4 me + 1 additional bins, yielding an APTAS also for the 
case of the general dimension d. 

To simplify the notation, we will assume that no small item is entirely packed 
into one bin, i.e. that no entry of x* is equal to one. The generalization to the 
case in which x* 7 = 1 for some i,j is obvious by fixing item i into bin j and 
considering only the packing of the fractional items. 

Let S = {1, . . . , s} with ai ^ <22 ^ a s - We will pack the items in S in 

increasing order into the m given bins until we “get stuck”. Lemma [T] will state 
that at this point there are at most 2 to small items left unpacked. In principle, 
in order to pack some item i we will try to pick a bin (which possibly already 
contains a fractional part of i) and pack item i completely into that bin. To 
make room for item i we will “throw out” the fractional parts of larger items 
i + 1 , i + 2, . . . as far as necessary. 

More formally, we will derive step by step a rounded solution x from the 
original LP solution x*. Initially, we set x := x* . We pack the smallest item 1 as 
follows. If there are bins j such that x ij = "YhieS x ij < 1> we se t x ij := 0 
for i £ S, i.e. we do not consider these bins to pack any item because it may 
happen that no item fits completely into such a bin. Then we find an (arbitrarily 
chosen) bin j such that Xij > 0, letting fi := 1. If no such bin exists (which is 
possible due to the fact that some bins are not considered) we choose a bin j 
such that Xf 1 j > 0 and /1 is as small as possible. (If all entries in x are 0, item 1 
as well as all other small items are unpacked, see below.) We pack item 1 in bin 
j, using the capacity of bin j allocated by the LP solution for items fi, ... ,ei, 
where e\ is the smallest index such that y := Xhj > 1- In our rounded 

solution, this corresponds to setting X\j := 1, Xu '■= 0 for i ^ j , Xhj := 0 for 
1 < h < ei, and x ei j := y — 1. We say that the packing of item 1 starts at level 
/1 and ends at level e\. Note that, if y > 1, not all capacity allocated for item 
ei is used for item 1; in other words we still have a fraction x ei j that may be 
used to pack into bin j items 2, 3, . . . , e\. To be more precise, we formally state 
our rounding procedure in Figure [T] 

If after the bin closing step no bin is left for item i, the remaining items 
i, ... ,s will be packed into separate bins as above. 

The procedure is illustrated in Figure |2]on an example, showing the rounded 
solution x at the beginning as well as after each solution updating step. Among 
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Procedure LP-round 

For each item % = 1, . . . , s perform the following steps: 

Bin closing: 

For each bin j such that ^f h>i Xhj < 1, let Xhj '■= 0 for 
h > i 

(close bin j) 

If all bins are closed then terminate 
Bin selection: 

If Xij > 0 for some bin j then 

Let Xu := 0 for £ ^ j and /; := i 

Else 

Let fi := min{r : r > i and x r k > 0 for some bin k} 
Let j be a bin such that Xf t j > 0 
Solution updating: 

Let Xij := 1 (pack item i in bin j) 

Let e» := min{r : Y7h=i x hj > 1} and y := J2h=i x hj 
Let Xhj i= 0 for h = 1, . . . , e» — 1 and x ei j y — 1 

Fig. 1. The LP rounding procedure. 



6 items, items 1,2, 3,4 are packed, respectively, into bins 1,2,3, 3, with /i = 
1, e\ = 4; /2 = 2, e 2 = 6; / 3 = 3, e 3 = 4; / 4 = 5, e 4 = 6. 

Lemma 1. The number of small items not packed into the first m bins at the 
end of the rounding procedure is at most 2 in. 

Proof. The number of small items packed by the rounded solution into bins 
1, . . . , m, is equal to a := Sj=i x ij- We will compare this value to s = 

J2ies X)j= l x*j. Note that er is initially equal to s and may be decreased only in 
the bin closing step and in the bin selection step, whereas it is unchanged in the 
solution updating step. It is immediate to see that the overall decrease in all bin 
closing steps is at most m. In the following we will bound the overall decrease 
in all bin selection steps. 

Noting that there may be some decrease in the bin selection step for item i 
only if fi = i, let k be the last such that fk = k, i.e. fi > i for i > k. If k < m, 
then the overall decrease in all bin selection steps is not larger than m, as the 
decrease in each step is not larger than 1. 

We now consider the case k > to. The definition of / ensures fi < f 2 < 
• • • < fk- Hence, since fk = k, the packing of all items in 1 ,k starts at a 
level not larger than k, i.e. f t < k for i = 1, . . . , k. Clearly, the number of items 
i with fi < k whose packing ends at a level larger than k, i.e. e* > k, cannot be 
larger than to, the number of bins available. This means that < k for at least 
k — to items among 1 , ,k. Now, since the decrease of er during the bin selection 
steps in the first k iterations is only due to changing entries x-ij with i < k, this 
decrease is at most to since X^ =1 x ij = an( i Xu=i Y^jLi x ij > k—m (at 
least k — in items among 1 , ,k are packed by the rounding procedure). After 
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Fig. 2. Illustration of the rounding procedure. 



packing k no further decrease takes place in the bin selection step, so the total 
decrease in this step is at most m. 

In both cases, we have an overall decrease not larger than 2m, i.e. the number 
of small items unpacked at the end of the rounding step does not exceed 2m. □ 

The discussion above shows 



Theorem 3. There is an APTAS for the ordered VP. □ 

As already mentioned, adding a cardinality constraint to an ordered VP instance 
yields another ordered VP instance. This shows 

Corollary 2. There is an APTAS for the k-item ordered VP. □ 

We outline our APTAS in Figure It should be noted that efficiency was 
not an objective in the design of the algorithm. Its aim is rather to illustrate the 
existence of an APTAS for the ordered VP. 

In particular, the practical inefficiency is due to the complete enumeration 
of the feasible assignments of large items to the bins. Although the inefficiency 
in handling large items is common to all approximation schemes for packing 

i 12 

problems, in our scheme the number of feasible assignments is (roughly) nP , 
for each of them we solve an LP, etc., whereas most of the schemes for easier 
packing problems have linear running time for fixed e. 

Although this makes the practical application of our approximation scheme 
very unlikely, some of the ideas may be useful within the design of practical 
heurstics. For instance, the separation between “large” and “small” items, pack- 
ing the large ones first (by some reasonable heuristic instead of almost complete 
enumeration) and then the small ones (possibly by LP and rounding) may turn 
out to be effective in practice. 



Approximation Schemes for Ordered Vector Packing Problems 



73 



Algorithm APTAS 

Partition the item set into S and L 

For all m = 1, . . . , n (possible number of bins in the optimal solution) 

If l > 2/e 2 then 

Partition L into Lo , . . . , L p - 1 (each containing [|L|e 2 J items) and 

L P 

(containing \L\ —p [|Z/|e 2 J items) 

Pack the items in Lo into |Lo| bins 

For each feasible assignment of packing vectors to the m bins 
Pack the items in Lo, , L p - 1 into the m bins 
as induced by the packing vectors 
If the packing generated is feasible then 

Pack the items in L p into \L P \ extra bins 
Solve the LP corresponding to the packing of the items 
in S into the m bins 
If the LP has a feasible solution then 

Pack the items in S according to the LP solution, 
using extra bins for the unpacked small items 
Possibly update the best solution found so far 

Else (£ < 2/e 2 ) 

For each feasible packing of the large items to the m bins 

Solve the LP corresponding to the packing of the items in S 
into the m bins 

If the LP has a feasible solution then 

Pack the items in S according to the LP solution, 
using extra bins for the unpacked small items 
Possibly update the best solution found so far 

Fig. 3. Outline of the APTAS for the ordered VP. 



3 Extension to Bounded Dilworth Number 

Let TV 1 , . . . , TV C be the partitioning of the item set TV into the minimum number 
of ordered sets as described in the Introduction. Throughout the section, we will 
consider the case in which c is a constant. 

The APTAS for this problem is derived along the same lines as the approxi- 
mation scheme in Section |2 We treat the items in each set TV - 7 in the same way as 
we treated the whole item set TV in the previous section. Namely, we consider the 
set L 3 of the large items in TV- 7 , i.e. items with > e for some k e {1, . . . , d}. 
Let l 3 := | L 3 \. If £ 3 < 2/e 2 , there is only a constant number of large weight 
items in L J . Otherwise, we perform item grouping again and partition L J into a 
constant number p 3 + 1 of subsets containing (again except the last one) [Fe 2 J 
consecutive items. The small items from all sets TV - 7 will be considered later. 

Now we consider all packing vectors corresponding to the packing of large 
items into bins. Since the number of sets L 3 is at most c and each of them is either 
partitioned into a constant number p 3 + 1 of subsets with consecutive items, or 
contains only a constant number of items, a packing vector has constant length. 
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More precisely, now a packing vector has pi + 1 entries for each set Ni which 
is partitioned into pi + 1 subsets, each telling the number of large items packed 
from each subset. Moreover, a packing vector has P entries for each set Ni 
with a constant number of large items, and has value 0 or 1 indicating whether 
the corresponding item is packed or not. Obviously, every entry of a packing 
vector is again at most [-J . Therefore, the overall number of packing vectors is 
again bounded by a constant and the number of possible assignments of packing 
vectors to the m bins is bounded by a polynomial in to. 

The induced packing can be constructed from each generated packing vector 
in the same way as described in Section 12.11 (of course only for the components 
corresponding to sets Ni with P > 2/e 2 ). As before we consider only feasible 
assignments of packing vectors to bins which leaves again at most [Fe 2 ] items 
unpacked for every set Ni . Packing all these items into separate bins requires 
at most erne extra bins by employing a trivial weight bound separately on each 
set Ni. 

There remain the small items to be packed for every generated feasible pack- 
ing. This is done again by the same LP as in the previous section. The fractional 
solution values are rounded separately for each set Ni . By the argument of Sec- 
tion E3 1 we now have at most 2 cm items which are not packed into the first 
to bins, and we can pack any [/J of them together in one bin thus requiring 



as before at most 



uj 



bins. Overall, the number of bins used is at most 



to + cme + Acme + 1 , which shows the following 



Theorem 4. There is an APTAS for VP with constant Dilworth number. □ 
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Abstract. We introduce the notion of incremental codes. Unlike a reg- 
ular code of a given rate, which is an unordered set of elements with a 
large minimum distance, an incremental code is an ordered vector of ele- 
ments each of whose prefixes is a good regular code (of the corresponding 
rate). Additionally, while the quality of a regular code is measured by its 
minimum distance, we measure the quality of an incremental code C by 
its competitive ratio A: the minimum distance of each prefix of C has to 
be at most a factor of A smaller than the minimum distance of the best 
regular code of the same rate. 

We first consider incremental codes over an arbitrary compact metric 
space M, and construct a 2-competitive code for M. When M is fi- 
nite, the construction takes time 0(|A/| 2 ), exhausts the entire space, 
and is NP-hard to improve in general. We then concentrate on 2 spe- 
cific spaces: the real interval [0, 1] and, most importantly, the Hamming 
space F n . For the interval [0,1] we construct an optimal (infinite) code 
of competitive ratio In 4 ~ 1.386. For the Hamming space F n (where the 
generic 2-competitive constructive is not efficient), we show the follow- 
ing. If |.F | > q, we construct optimal (and efficient) 1-competitive code 
that exhausts F n (has rate 1). For small alphabets (|J ? | < q), we show 
that 1-competitive codes do not exist and provide several efficient con- 
structions of codes achieving constant competitive ratios. In particular, 
our best construction has rate (1 — o(l)) and competitive ratio (2 + o(l)), 
essentially matching the bounds in the generic construction. 



1 Motivating Example 

Imagine the following problem which was actually given to one of the authors. 
An Internet company wants to assign account numbers to its customers when the 
latter shop on-line. An account number allows the customer to check the status 
of the order, get customer support, etc. In particular, the customer can enter 
it over the phone. Because of that and several other reasons, account numbers 
should not be too long. On the other hand, we would like account numbers to be 
somewhat far from each other, so that it is unlikely for the customer to access a 
valid number by mis-entering few digits. One way to achieve this would be to use 
an error-correcting code of reasonable minimum distance (for example, a random 
account number might work for a while). This has two problems, however. First, 
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good distance implies not very good rate, and since the account numbers are 
quite short, we “waste” a lot possible account numbers, and exhaust our small 
account space too quickly (thus, losing customers). Secondly, when the number 
of customers is small, the corresponding prefix of our code is not as good as we 
could have made it with so few account numbers. 

We propose a much better solution to this problem, namely, to use an incre- 
mental code. Such a code will eventually exhaust (or nearly exhaust) the whole 
space. Indeed, when the number of customers is huge, we prefer to have close 
accounts numbers rather than to lose customers. On the other hand, when the 
number of customers is i, incremental code guarantees that the first i account 
numbers we assign will be almost as far from each other as any possible i ac- 
count numbers could be! In other words, the minimal distance of larger and 
larger prefixes of the code slowly decreases at an almost optimal pace. 

Notice that while it is customary to measure regular codes of a given rate in 
terms of their minimum distance, a more relevant measure of incremental codes 
is the relative behavior of minimal distance on larger and larger prefixes. This 
leads to the notion of a competitive ratio of an incremental code C. Namely, C 
is A-competitive if the minimum distance of each prefix of C is at most A times 
smaller than that of the optimum code of the same rate. 

Organization. While the main motivation for incremental codes comes from 
the Hamming spaces, we start in Section |2] by defining and studying the cor- 
responding notion on arbitrary metric spaces. In particular, we obtain a 2- 
competitive codes for any such metric space, and show that it is NP-hard (in 
general) to beat this competitive ratio. We then study two specific spaces. In 
Section Q3 we construct an optimal 1.386-competitive incremental code for the 
real interval [0,1]. Finally, in Section U we study the most important Hamming 
space F n (where the generic construction is inefficient). In Section H.ll we give 
a simple and efficient 1-competitive code for the Hamming space over moderate 
alphabets (in particular solving the “account problem” above). In the remain- 
ing subsections we concentrate on the intricacies of the Hamming space over 
small alphabets. While it is much harder to obtain competitive codes in this 
case (since our understanding of optimal codes is somewhat limited), we give 
several efficient constructions achieving constant competitive ratios. 

2 General Notion and Construction 

For simplicity, we restrict our attention to finite metric spaces, even though 
most results extend to compact metric spaces as well. So let A4 = (M, D) be 
any finite metric space on point set M with metric function D. A (regular) code 
on At is simply a subset of points S C M. The minimum distance d_M (S) of S 
is the smallest pairwise distance between distinct points in S. For an integer i 
we define the quantity opt -dj^(i) to be the largest minimal distance of a code of 
cardinality i: opt -djn(i) = maxi^i—^ dx(iS). 

An incremental code C = (ci . . . Cfc) is an ordered sequence of distinct points 
of M. C is exhaustive if k = \M\, i.e. the code eventually runs through the 
entire space. For every i £ [k] we define the i-tli prefix of C, Ci = {ci . . . Ci}, 
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and view it as a regular (unordered) code of cardinality i. We say that C is A- 
competitive, if for every i G [fc], the i-tli prefix Cj of C forms a code of distance 
at least opt-d^W/A i.e. opt-dx(i) < A ■ d_M(Ci )■ We denote by rj^(C) the 
(best) competitive ratio of C, and by opt-rx(fc) the smallest competitive ratio 
of any incremental code of cardinality k: opt-r^vi (fc) = min|c| = fc rx(C). We define 
opt-rx = opt-rx(|M|), and call it the competitive ratio of AL (We notice that 
since the prefix an ^-competitive incremental code is also A-competitive, we have 
that opt -rj^(k) is a non-decreasing function of k.) We say that an incremental 
code C is perfect if C is 1-competitive, and that the space Ai is incrementally 
perfect if it has an exhaustive 1-competitive code (opt-rvi = 1). 

Theorem 1. 

1. The competitive ratio of any Ai is at most 2: opt-r_vf < 2. Moreover, given 
Ai as an input, one can construct an exhaustive 2-competitive incremental 
code C for Ai in time 0(\AI\ 2 ). In fact, constructing k-prefix of C can be 
done in time 0(k ■ \M\). 

2. There exist Ai with competitive ratio 2. 

3. For any A < 2 and given Ai as an input, it is NP -hard to construct A- 
competitive incremental code for Ai , even when the competitive ratio of Ai 
is known to be 1. In particular, it is NP -hard to approximate the competitive 
ratio of Ai to within a factor less than 2. 

Proof. Given a point p and a finite set of points S, define the distance from p to 
S to be D(p, S) = min ge s D(p , q). We use the following simple greedy algorithm 
for constructing C. 

1. Let Ci be any point of M, and let C\ = {ci}. 

2. For all p € M: set closest(p) = ci. 7. Note, D(p, closest(p)) = D(p,C\) 

3. For k = 2 to \M\: 

— Let Cfc be the furthest point from Ck-i, i.e. maximizing D(ck,Ck- 1 )- 
7, Done in linear time by using D(ck,Ck-i) = Z?(cfc, closest(cfc)) 

- Set C k — {c fc } U Ck-i- 

— For all p G M: if D(p,Ck) < D(p,c\osest(p)), update closest(p) = Cfc. 

7. This insures that D{p,Ck) = D{p, closest(p)) indeed 

4. Output C = (ci 

It is easy to see that each iteration of the greedy algorithm is implemented in 
linear time 0(|M|), justifying the running time. 

Now, take any 2 < k < \M\. The 2-competitiveness of C follows from the two 
claims below. 

Claim 1: {Ck) — D(ck,Ck- i), i.e. the closest pair of points in Ck. includes Cfc. 

Proof: Assume d_M(Ck) = D((k,Cj) < D(ck,Ck~ i), where i < j < k. Then 
D(cj,Cj- 1 ) = D(ci,Cj) < D(ck,Ck- i) < D(ck,Cj- 1 ), i.e. Ck should have been 
added before Cj, a contradiction. □ 

Claim 2: D(ck,Ck- i) > \ ■ opt-dvf (fc). 

Proof: Let bi...bk be the optimum code of cardinality k, i.e. D(bi,bj) > 
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opt -d.M(k) for some i ^ j. Then the k open balls of radius R = ^ • opt-d J vt(fc) 
around the bf s are all disjoint. Hence, at least one of these k balls does not 
contain any of the first (k — 1) selected points C\ . . .Ck- 1- Say this is the ball 
around bj. Hence, D(bj,Ck- 1) > R. But c& is the furthest point from Ck- 1, and, 
therefore, D(ck,Ck- 1) > D{bj,Ck-\) > R. □ 

We next give an example of A4 with opt -tm = 2. Let M = {w, x\, X 2 , yi, y 2 , 
zj, where D(w,Xi) = 1, D(xi,yj) = 2, D(yj,z) = 1, i,j = 1,2, and the other 
distances are the length of the shortest paths induced by the above assignments. 
In particular, the two furthest points are w and 2 of distance 4, and the best 4- 
code is {x\,X 2 ,yi,y 2 } of minimum distance 2. In other words, opt-dx(2) = 4 = 
D(w,z) and opt-c?^((4) = 2 = D(xi,yj), i,j = 1,2. Now, for any incremental 
code C = (ci, C2, C3, C4}, unless C4 = {xi, X 2 , yi, 2/2}, one of the pairwise distances 
in C4 will be 1, giving a gap of 2/1 = 2. On the other hand, if C\ = {xi, X2, 2/1 , 2/2}, 
then D(ci, c 2 ) = 2, giving again a gap of 4/2 = 2 for the 2-prefix of C. 

Finally, we show that it is NP-hard to construct an A-competitive code for 
A < 2 when given A4 as an input, even if opt-r^i = 1. We make a reduction 
from the Maximum Independent Set problem, which is known to be NP- 
complete |O.T79j . Given a graph G = (V,E), we define a metric space A4 = 
(M, d), where M = V and D(i,j) = 1 iff (i,j) € E , and D(i,j) = 2 otherwise. 
Let k be the (unknown) value of the maximum independent set in G. It is easy 
to see that the an optimal incremental code for A4 is 1-competitive, and should 
first list the k elements of any maximum independent set of G (in any order), 
followed by the other elements (in any order). In particular, opt -dM^) = 2 for 
i < k and opt-djvf(i) = 1 otherwise. Moreover, any such 1-competitive code 
induces the maximum independent set by looking at the largest i-prefix with 
minimal distance 2. 

On the other hand, any code for A4 which is not 1-competitive for A4 must be 
2-competitive. Hence, if we have a procedure that can produce an H-competitive 
code for Ai, where A < 2, this procedure must in fact produce an optimal 
1-competitive code. But we just argued that in this case we can compute the 
largest independent set of G, which is NP-hard. 



Remark 1. Notice that the greedy algorithm above is exactly the same as that 
of Gonzalez [US5] for the so called ^-center problem. This is a just a coincidence, 
since our problems and the analysis are quite different. 

Remark 2. While the greedy algorithm is extremely efficient for generic metric 
spaces, we are mainly interested in the Hamming space F n . In this case we 
cannot afford to scan the entire (exponential in n) space to add every new 
point. Therefore, the greedy algorithm is inefficient for the Hamming space. See 
Section [4] for the discussion of efficiency and efficient constructions for F n . 

Extending to compact spaces. Aside from the complexity considerations, 
most of the discussion above (in particular, the greedy algorithm) extends to 
infinite compact metric spaces, like “nice” compact subsets in R n . The main 
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difference is that “exhaustive” codes now become countably infinite codes, and 
we require every finite prefix of such a code to be “^-compatible” w.r.t. the best 
finite code of the same cardinality. We illustrate it in detail in the next section, 
when talking about the real interval [0, 1], 

3 Optimal Code for [0, 1] 

An incremental code over [0, 1] is simply a sequence of points C = (pi,P 2 , ■ ■ •}■ 
If we let q\. . ,q\ denote p\ . . ■ p, in the increasing order, then after i steps [0, 1] 
is split into (i + 1) intervals Iq = [0, q\\, I 2 = [q \ , <$] i ■ • • > h = [<7l , 1] - Clearly, 
the minimal distance of C, is d{Ci) = min(|/i|, . . . , |/j_i|)), while the optimum 
distance is opt-d(z) = l/(z — 1) (by spreading the points uniformly). When adding 
Pi+i we simply subdivide one of the Ij’s into two subintervals. If we assume that 
Pi = 0 and P 2 = 1 (which will happen in our solution and will be the “worst case” 
in the lower bound proof), then the “border” intervals Iq and /,;+i disappear, 
and our objective is to place the points P 3 ,P 4 , ... on [0, 1] in such a manner that 
the length of the smallest interval after each pi is as close to l/(z — 1) as possible. 
We notice that the dual “maximal interval” version of the latter problem - make 
the largest interval as close to 1 j[i — 1) as possible - is a well known dispersion 
problem (see [DT97lC00IM99j 1 . While our lower bound and its proof will be 
somewhat different for our “minimal interval” version, it will turn out that the 
optimal sequence for both versions will be the same, which is not at all clear 
a-priori. 

As a warm-up, let us examine the behavior of the greedy algorithm in Theo- 
rem [l] which is guaranteed to give at least a 2-competitive code. Assume we start 
with ci = 0. The next step makes C 2 = 1, and then we simply keep subdividing 
the largest interval in half. Thus, after 2 k points, all but one interval will be of 
size 1 / 2 fe , but that last interval will be of size l/2 k ~ 1 . Since the optimum for 2 k 
points is l/(2 fe — 1), we get the ratio at least ( 2 k — l)/2 fc_1 — > 2. Thus, greedy 
actually gives a ratio of 2. As we show now, the best (infinite) incremental code 
for [0, 1] actually achieves the ratio of In 4 = log e 4 « 1.386 < 2. 

Let H(k) = (1 + \ + . . . + r) denote the fc-th harmonic series. 

Lemma 1. Unless A > 2 ■ [H(2i) — H(i + 1)], no incremental code of ( 2i + 1) 
points in [0, 1] is A-competitive. 

Proof. Consider a code of 2i + 1 points in [0, 1] with competitive ratio A. For 
every j < 2i + 1, consider the distances between adjacent points after placing 
the first j points. Let l\ < B 2 < . . . be these distances, sorted in increasing 
order. We need the following claim: 

Claim: (? k < (^2 for 1 < k < j — 2. 

Proof: Adding a point (in this case, ( j + l)-st point) can either add one more 
distance to the list of interval distances (if the new point is the rightmost or the 
leftmost), or it can remove one length from the list, replacing it with two others 
(if the new point lies between two old points). In either case, there are at most 
two new lengths that are added to list. This means that among the first k + 2 
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lengths on the new list, there are at least k lengths that were already on the old 
list before we added the last point. Hence, the (fc + 2)-nd smallest length on the 
new list cannot be smaller than the fc-th smallest length on the old list. □ 

By iterating the above claim, we get for all 0 < j < i — 1, £^ t+ ~ 3 < 

Notice, +1_J is the length of the smallest interval after adding (2i + 1 — j) 
points. Since our code is A-competitive (and since the optimal arrangement of 
2i + 1 — j points has distance 1/(2* — j)), we must have 1 < A, which 

means that > £^ +1 ~^ > 1/(A(2* — j)). Summing the last inequality for 

j = 0 ... i — 1, we get 



2^^ Z a \2i + 2i-l 

3=0 V 



2-\ = }-. [H{ 2i)-H{i + l)] (1) 



On the other hand, since l \ < £2+2 \j and all the 2i intervals sum to at most 
1, so we get 



i— 1 i— 1 / «2i+l I «2i+l \ 1 

E^<E 4 ( 2 ) 



3=0 j=o \ / 

Combining Equation 0 and Equation d2]), we get A > 2 ■ [H(2i) — H(i + 1)]. □ 
Since 2 • [H{2i) — H{i + 1)] « 2 In (/pj) l — > In 4, we get 
Corollary 1 . If C is an infinite A-competitive code, then A > In 4 « 1.386. 



We now show an incremental code achieving the bound above. We let pg = 0, 
Pi = 1, and explicitly tell the lengths of the i intervals after the first (z+1) points. 
They are (in increasing order): log 2 (l + 2i=r),log 2 (l + ^2), ■ ■ ■ ,log 2 (l + y)- 
Notice, E}=ilog 2 (l + 2W7) = log 2 (nj=i w/=/r) = lo S2(f) = ^ indeed. Also, 
for * = 1 our only interval is indeed of size 1 = log 2 (l + y). To add the (i + 2)-nd 
point, we subdivide the currently largest interval of size log 2 (l + i) into two 
intervals of sizes log 2 (l + and log 2 (l + jTfi) (again, arithmetic works), as 
claimed. We see that the length of the smallest interval after (* + 1) points is 
log 2 (l + 2 Yu) — 1!, 4 (th e latter is easy to check), proving that this sequence 
has competitive ratio In 4. 

Remark 3. The above argument (construction and the lower bound) can be ad- 
justed to the case of the closed circle S l (where distance is the shortest arc of the 
circle). Any point p on the circle is identified with both 0 and 1 on the interval, 
and then the same code as above is used. Thus, the i intervals we get after i 
points have the same lengths as the i intervals we get after (i + 1) points in the 
[0, l]-construction, but both optima are still l/i, so the ratio is unchanged. 
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4 Hamming Space (Error-Correcting Codes) 

We now turn our attention to the most practically important space - the Ham- 
ming space F n . For that, we first recall some terminology and address some of 
the issues specific to this space. 

Terminology. Recall, the Hamming distance between x, y £ F n is the number 
of positions they disagree. Unless otherwise stated, we will assume that F = [q] 
is a field of size q. A code with K codewords and minimal distance d over F 
is said to have rate r = \og q K/n, dimension k = log q K and relative distance 

5 = d/n. We omit the subscript to the space from the quantities opt- 7 ' and 
opt-d when the Hamming space is clear from the context, and otherwise write 
opt-r(A'; q, n ) and opt -d(K; q, n ) to emphasize the space. We let opt-<5(r; q, n ) = 
1 ■ opt-d(< 7 rn ; q,n) be the largest possible relative distance of a code of rate r 
over F. We let V q (R\ n ) denote the volume of an 77 ,-dimensional sphere of radius 
R in F n , and notice that asymptotically, we have • log g V q {an\ n ) ~ H q (a) = 
a log q {q — 1) — a\og q a — (1 — a)log 9 (l — a), where H q (-) above is the q- ary 
entropy function (in particular, V q (an; n ) < q nH i(°‘l'). Finally, a linear code of 
dimension k is given by the k x n generator matrix G, where the codewords are 
of the form xG for x £ F k . Most codes we will use below are linear. 

What is an “efficient” construction? As opposed to the generic case, 
where we are given the entire metric space as input and need to produce as 
output the “code” itself (as a list of points), in the case of the Hamming space 
we usually think of entire space as being exponential in the relevant parameters, 
and we think of the code as having some implicit small representation. What 
we may require in terms of efficiency is to have a representation of the code 
whose length is polynomial in n and log q , and an efficient procedure that given 
this representation and an index i, produces Cj, the *-th codeword. For example, 
viewed in this light, a random code is not an “efficient construction”, but a 
random linear code is. 

How GOOD is the OPTIMAL CODE? To compute competitive ratios, we need 
to understand the behavior of the optimal code of a given rate. This is well 
known for codes over “large” alphabets (q > n, see Section ITU . For codes over 
small alphabets, we only have bounds on the minimal distance of the optimal 
code, rather than a closed-form formula. Hence, the competitive ratio that we 
can prove depends not only on the performance of the code in question, but also 
on the quality of these bounds. 

On a brighter side, the discrepancy between the known upper- and lower- 
bounds on the optimal distance as a function of rate is at most a small constant 
factor. In fact, the ratio between the Hamming bound (an upper-bound) and 
the Gilbert- Varshamov bound (a lower-bound) is a factor of 2 “in spirit”. To 
see that, recall that the Hamming bound says that for any code with K code- 
words and minimum distance d over [q] n , it holds that K ■ V q (d/2\ n ) < q n , i.e. 
opt -d(I\; q,n) < 2V~ 1 {q n / K\ n). The Gilbert- Varshamov bound, on the other 
hand, says that there exists a Jv-word code with minimum distance d satisfying 
K ■ V q (d; n ) > q n , i.e. opt -d(K; q,n) > V~ l {q n / K\ n). 
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In principle, this means that when we use the Hamming bound as our es- 
timate for the performance of the optimal code, we only lose a factor of two 
(or less). However, notice that when we use that bound, we usually use some 
estimate for V~ x (since working with V ~ 1 itself is too hard), so we may lose 
some small additional factor there (see Section 14.51 for an example) . 

Organization. The rest of this large section is organized as follows. In Sec- 
tion [4J] we consider the case q > n, and show that F n is incrementally perfect, 
i.e. has a 1-competitive exhaustive code. The remaining sections deal with a more 
difficult case of small alphabets (q < n). Section 14721 shows that F n is not incre- 
mentally perfect, while the last three sections deal with efficient constructions 
achieving constant competitive ratios for various settings of parameters. 



4.1 Optimal 1-Competitive Code for Hamming Space F n , |.F| > n 

Let \F\ = q > n. In this case, the well known Reed-Solomon codes are known to 
be the optimal codes for any given dimension k. These linear codes first view the 
elements of F k as polynomials of degree at most (k — 1), and then output the 
values of these polynomial at n arbitrary points of F (here we use q > n). We 
make the following simple observation to turn them into (optimal) incremental 
codes: polynomials of degree at most 0 are special cases of polynomials of degree 
1, which in turn are special cases of polynomials of degree at most 2, and so on. 
We now formalize this idea to get the claimed 1-competitive code. 

Let a± . . . a n be arbitrary distinct elements of F. Given a = (ao . . . afe-i) £ 
F k , assign a polynomial p a {x ) = a iX x of degree at most ( k — 1), and 

output the codeword (p(oti) . . .p(a n )) £ F n . Since any two distinct polynomials 
of degree at most ( k — 1) can agree on at most ( k — 1) points in F, the distance 
of the RS-code at least (in fact, exactly) d = n — k + 1. On the other hand, the 
classical singleton bound says that any code of dimension k must have minimal 
distance at most (n — k + 1), achieved by the corresponding RS-code. 

Now, if we first encode (i.e., evaluate at n points) the polynomials of degree 
0, followed by the polynomials of degree 1 and so on, we see that the minimal 
distance of our now incremental code slowly decreases from n to (n — 1), . . . , all 
the way to 1. Moreover, at the time we are encoding polynomials of degree exactly 
k , our current code’s minimal distance (n — k + 1) is optimal by the singleton 
bound (as we already listed q k_1 polynomials of degree at most ( k — 1)). Thus, 
we showed that the Hamming space F n has a 1-competitive exhaustive code 
C = (co, . . . , c/v— i), where N = q n . Additionally, it is easy to compute Ci given i. 
Indeed, if we write the elements of F as numbers 0, . . . , (q— 1) (0 being the “zero” 
of F ), and then interpret the representation of an integer i £ {0, . . . , TV — 1} base 
|F| as a string a(i ) £ F n , then listing i in the increasing order corresponds to 
the lexicographic order of the a(i)’s, which also lists the polynomials p a (i) hr the 
order of increasing degrees, as needed. To summarize, we get 

Theorem 2. F n is incrementally perfect when |.F| = q > n and F is a field. 
Specifically, letting Ci = (p a (i)(ar), • • • ,p a (i)(&n)) £ F n , the incremental code 



